Scaling Long-Range Forces on Massively Parallel Systems

D. F. Richards, J. N. Glosli, B. Chan, M. R. Dorr, E. W. Draeger, J.-L. Fattebert, W. D. Krauss, T. Spelce, F. H. Streitz, M. P. Surh; Lawrence Livermore National Laboratory, Livermore, CA
J. A. Gunnels; IBM Corporation, Yorktown Heights, NY

With supercomputers anticipated to expand from thousands to millions of cores, one of the challenges facing scientists is how to effectively utilize the ever-increasing number of tasks. We report here a scalable approach to computing long-range forces using the Particle-Particle Particle-Mesh (PPPM) method that creates a heterogeneous decomposition by partitioning effort according to the scaling properties of the component algorithms. In the PPPM method, the force calculation is divided into an explicit short-range piece and a long-range piece that is solved in Fourier space. These two pieces have different scaling properties and can be calculated independently. Our strategy heterogeneously decomposes the 262,144 CPUs on Jugene so that only a fraction (16,384) perform FFTs, while the rest concurrently calculate the highly scalable short-range part of the interaction. By paying particular attention to load balancing between the disparate pieces, we achieve a performance in excess of 300 TFlop/s and an improvement of more than 25x in particle positions updated per second compared to previously reported results from other codes.
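
The split of tasks described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the contiguous assignment of the lowest-numbered ranks to the FFT group, and the timing function are all assumptions made for illustration. Only the two counts (262,144 tasks in total, 16,384 of them performing FFTs) come from the text.

```python
# Illustrative sketch of a heterogeneous task decomposition.
# Assumption: ranks 0..fft_ranks-1 handle the long-range (FFT) piece and the
# remaining ranks handle the short-range piece. The abstract does not say how
# ranks are actually assigned; this layout is hypothetical.

def split_ranks(total_ranks, fft_ranks):
    """Return (fft_group, short_range_group) as ranges of rank ids."""
    if not 0 < fft_ranks < total_ranks:
        raise ValueError("fft_ranks must be between 1 and total_ranks - 1")
    return range(0, fft_ranks), range(fft_ranks, total_ranks)


def role_of(rank, fft_ranks):
    """Name the piece of the force calculation a given rank works on."""
    return "long_range_fft" if rank < fft_ranks else "short_range"


def step_time(t_short, t_long):
    """Assumed model: the two pieces run concurrently, so the slower one
    sets the time per step. Load balancing aims to make them equal."""
    return max(t_short, t_long)


TOTAL_RANKS = 262_144  # from the text
FFT_RANKS = 16_384     # from the text

fft_group, short_group = split_ranks(TOTAL_RANKS, FFT_RANKS)
print("FFT tasks:        ", len(fft_group))
print("Short-range tasks:", len(short_group))
print("FFT fraction:     ", FFT_RANKS / TOTAL_RANKS)
```

With these counts, one task in sixteen works on the Fourier-space piece. Under the assumed timing model, any imbalance between the two groups leaves one of them idle for part of each step, which is why the balance between the pieces matters.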

We have demonstrated our strategy by performing a 140-million-particle molecular dynamics (MD) simulation of stopping power in a hot, dense plasma and benchmark calculations with over 11 billion particles. With this unprecedented simulation capability we are beginning an investigation of plasma properties under conditions where both theory and experiment are lacking: in the strongly coupled regime as the plasma begins to burn.

Our strategy is applicable to other problems involving long-range forces (e.g., biological or astrophysical simulations). More broadly, we believe that the general approach of heterogeneous decomposition will allow many problems to scale across current and next-generation machines.

This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
