Thomas Schlesier wrote:
Hi all,

Why use 3.3.3? For most purposes the most recent version is more reliable and much faster.

I have done some small tests of parallel calculations with different systems.
All simulations were done on my laptop, which has a dual-core CPU.
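
For context, a 2-core run with the 3.3.x series is typically set up along the following lines; this is only a sketch, and the file names and the name of the MPI-enabled binary (mdrun_mpi) are placeholders, not taken from the post:

# single-CPU run
grompp -f md.mdp -c conf.gro -p topol.top -o topol.tpr
mdrun -v -s topol.tpr

# 2-CPU run: with 3.3.x the run input is prepared for 2 nodes via
# grompp -np, and mdrun is started through MPI
grompp -np 2 -f md.mdp -c conf.gro -p topol.top -o topol_2cpu.tpr
mpirun -np 2 mdrun_mpi -np 2 -v -s topol_2cpu.tpr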

(1) 895 waters - 2685 atoms, 50000 steps (100 ps)
cutoffs 1.0 nm (no PME); 3 nm cubic box

Nobody uses cutoffs any more. Test with the method you'll use in a real calculation - PME.
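
As a rough sketch of what that change involves (the file name is just a placeholder and the values are illustrative defaults, not tuned settings), the PME-related .mdp options would look something like this, replacing the plain cut-off electrostatics lines:

# PME settings to merge into the existing .mdp file
cat > pme-settings.mdp <<'EOF'
coulombtype     = PME     ; particle-mesh Ewald instead of Cut-off
rcoulomb        = 1.0     ; real-space cut-off (nm)
rvdw            = 1.0     ; van der Waals cut-off (nm)
rlist           = 1.0     ; neighbour-list cut-off (nm)
fourierspacing  = 0.12    ; PME grid spacing (nm)
pme_order       = 4       ; cubic interpolation
ewald_rtol      = 1e-5
EOF

Note that PME adds reciprocal-space work, so the timings below would not carry over directly - which is exactly why it is worth benchmarking with it.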

single:
              NODE (s)   Real (s)      (%)
      Time:    221.760    222.000     99.9
                      3:41
              (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:     13.913      3.648     38.961      0.616
parallel (2 cores):
              NODE (s)   Real (s)      (%)
      Time:    160.000    160.000    100.0
                      2:40
              (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:     19.283      5.056     54.000      0.444
Total Scaling: 98% of max performance

=> 1.386 times faster

(2) 3009 waters - 9027 atoms, 50000 steps (100 ps)
cutoffs 1.0 nm (no PME); 4.5 nm cubic box
single:
              NODE (s)   Real (s)      (%)
      Time:    747.830    751.000     99.6
                      12:27
              (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:     13.819      3.617     11.553      2.077
parallel (2 cores):
              NODE (s)   Real (s)      (%)
      Time:    525.000    525.000    100.0
                      8:45
              (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:     19.684      5.154     16.457      1.458
Total Scaling: 98% of max performance

=> 1.424 times faster
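
The speedup factors quoted above follow directly from the NODE times; as a quick worked check (bc is only used as a calculator here):

# case (2): speedup on 2 cores = single-core time / 2-core time
echo "scale=3; 747.83 / 525.0" | bc        # -> 1.424
# parallel efficiency = speedup / number of cores
echo "scale=3; 747.83 / (2 * 525.0)" | bc  # -> .712, i.e. ~71% per core
# case (1), for comparison
echo "scale=3; 221.76 / 160.0" | bc        # -> 1.386

So both systems reach roughly 70% parallel efficiency on two cores, which is what question 1) below is really about.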

(3) 2 waters
rest same as (1)
single:
              NODE (s)   Real (s)      (%)
      Time:      0.680      1.000     68.0
              (Mnbf/s)   (MFlops)   (ns/day)  (hour/ns)
Performance:      0.012    167.973  12705.884      0.002
parallel:
              NODE (s)   Real (s)      (%)
      Time:      9.000      9.000    100.0
              (Mnbf/s)   (MFlops)   (ns/day)  (hour/ns)
Performance:      0.003     17.870    960.000      0.025
Total Scaling: 88% of max performance

=> about 10 times slower
(This one was more of a test to see how the values look in a case where parallelisation is a waste.)

So now my questions:
1) Are the values reasonable (I mean not so much each individual value, but rather the speed difference between parallel and single)? I would have assumed that for the big system (2), two cores would give a speedup of a little less than 2, not only around 1.4.

It depends on a whole pile of factors. Are your cores real or only hyperthreads? Do they share caches? I/O systems? Can MPI use the cache for communication, or does it have to write through to main memory? How big are the caches? Laptops designed for web surfing and editing Word documents often skimp on the hardware that is necessary if you actually plan to keep your floating-point units saturated with work. You may like to run two copies of the same single-processor job to get a handle on these issues.
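
A rough way to run that test (directory and file names here are made up): start two identical single-CPU jobs at the same time in separate directories and compare their wall times with a lone single-CPU run. If each copy slows down noticeably, the two cores are competing for cache and memory bandwidth rather than for CPU cycles:

# two identical single-CPU runs started simultaneously; separate
# directories keep their output files from colliding
( cd copy1 && mdrun -s topol.tpr ) &
( cd copy2 && mdrun -s topol.tpr ) &
wait   # afterwards compare the NODE/Real times with those of a lone run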

2) In the md0.log files (for the parallel runs) I have seen the following line in all three simulations:
"Load imbalance reduced performance to 200% of max"
What does it mean? And why is it the same in all three cases?

Dunno, probably buggy.

3) What does "Total Scaling" mean? In case (3) the single run is about 10 times faster, but for the parallel run it says I get 88% of max performance (if I set the single run to 100%, the parallel run would only be at about 10%).

The particle decomposition used in 3.x will not always lead to an even load balance. That's life. The domain decomposition in 4.x will do a much better job, though it's probably not going to matter much for 2 cores on small systems and short runs.
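
For what it's worth, with 4.x no special preparation is needed for the domain decomposition; a sketch (file names are again placeholders, and -dlb auto is simply the default dynamic load balancing setting):

# GROMACS 4.x: the .tpr is no longer tied to a node count, and mdrun
# decomposes the box into domains itself, rebalancing them at run time
grompp -f md.mdp -c conf.gro -p topol.top -o topol.tpr
mpirun -np 2 mdrun -s topol.tpr -dlb auto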

Mark