Hi!

On 2013-07-17 21:08, Mark Abraham wrote:
You tried ppn3 (with and without --loadbalance)?

I was testing on an 8-replica simulation.

1) Without --loadbalance and without -np 8.
Excerpts from the script:
#PBS -l nodes=8:ppn=3
setenv OMP_NUM_THREADS 4
mpiexec mdrun_mpi -v -cpt 20 -multi 8 -ntomp 4 -replex 2500 -cpi -pin on

Excerpts from logs:
Using 3 MPI processes
Using 4 OpenMP threads per MPI process
(...)
Overriding thread affinity set outside mdrun_mpi

Pinning threads with an auto-selected logical core stride of 1

WARNING: In MPI process #0: Affinity setting for 1/4 threads failed.
         This can cause performance degradation! If you think your settings
         are correct, contact the GROMACS developers.


WARNING: In MPI process #2: Affinity setting for 4/4 threads failed.

Load: The job was allocated 24 cores (3 cores on each of 8 nodes). Each OpenMP thread uses ~1/3 of a CPU core on average. Conclusions: mpiexec starts as many processes as cores requested (nnodes*ppn = 24) and ignores the OMP_NUM_THREADS environment variable ==> this is wrong, and it is not a GROMACS issue. Each MPI process forks into 4 threads as requested, so 24 x 4 = 96 threads end up competing for 24 cores (~25% of a core per thread, consistent with the observed ~1/3). The 24-core limit granted by Torque is not violated.
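
In case it helps: if the launcher is OpenMPI's mpiexec, one rank per node can be forced explicitly. A minimal sketch (untested here; -npernode is OpenMPI syntax and may be spelled differently in other launchers, and ppn=4 is my assumption to match the 4 threads per rank):

#PBS -l nodes=8:ppn=4
setenv OMP_NUM_THREADS 4
mpiexec -np 8 -npernode 1 mdrun_mpi -v -cpt 20 -multi 8 -ntomp 4 -replex 2500 -cpi -pin on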

2) The same script, but with -np 8 added, to limit the number of MPI processes to the number of replicas.
Logs:
Using 1 MPI process
Using 4 OpenMP threads
(...)

Replicas 0, 3, and 6: WARNING: Affinity setting for 1/4 threads failed.
Replicas 1, 2, 4, 5, 7: WARNING: Affinity setting for 4/4 threads failed.


Load: The job was allocated 24 cores on 8 nodes, but mpiexec started processes only on the first 3 nodes. Each OpenMP thread uses ~20% of a CPU core.

3) -np 8 --loadbalance
Excerpts from logs:
Using 1 MPI process
Using 4 OpenMP threads
(...)
Each replica says: WARNING: Affinity setting for 3/4 threads failed.

Load: MPI processes spread evenly on all 8 nodes. Each OpenMP thread uses ~50% of a CPU core.

4) -np 8 --loadbalance, with #PBS -l nodes=8:ppn=4 <== this setup worked ~OK with GROMACS 4.6.2
Logs:
WARNING: Affinity setting for 2/4 threads failed.

Load: 32 cores allocated on 8 nodes. MPI processes spread evenly; each OpenMP thread uses ~70% of a CPU core (with 32 threads on 32 cores one would expect close to 100%). With 144 replicas, the simulation did not produce any results, it just got stuck.


Some thoughts: the main problem most probably lies in the way MPI interprets the information from Torque; it is not GROMACS-related. MPI ignores OMP_NUM_THREADS, so the environment is simply broken. Since GROMACS 4.6.2 behaved better than 4.6.3 there, I am going back to it.
Best,
G


Mark

On Wed, Jul 17, 2013 at 6:30 PM, gigo <g...@ibb.waw.pl> wrote:
On 2013-07-13 11:10, Mark Abraham wrote:

On Sat, Jul 13, 2013 at 1:24 AM, gigo <g...@ibb.waw.pl> wrote:

On 2013-07-12 20:00, Mark Abraham wrote:


On Fri, Jul 12, 2013 at 4:27 PM, gigo <g...@ibb.waw.pl> wrote:


Hi!

On 2013-07-12 11:15, Mark Abraham wrote:



What does --loadbalance do?




It balances the total number of processes across all allocated nodes.
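
To illustrate with the 8-rank case from my tests above: without it, mpiexec fills consecutive slots (with ppn=3, node 1 gets ranks 0-2, node 2 gets ranks 3-5, node 3 gets ranks 6-7, and nodes 4-8 sit idle), whereas with --loadbalance the 8 ranks are placed round-robin, one per node.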



OK, but using it means you are hostage to its assumptions about balance.



That's true, but as long as I do not try to use more resources than Torque gives me, everything is OK. The question is: what is the proper way of running multiple simulations in parallel with MPI, each further parallelized with OpenMP, when pinning fails? I could not find any other.


I think pinning fails because you are double-crossing yourself. You do
not want 12 MPI processes per node, and that is likely what ppn is
setting. AFAIK your setup should work, but I haven't tested it.
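
(Indeed, a back-of-the-envelope check: nodes=48:ppn=12 gives 48 x 12 = 576 slots, so mpiexec starts 576 ranks unless told otherwise; 144 replicas x 4 OpenMP threads is also 576, which is presumably why the --loadbalance layout happens to fit at all.)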


The thing is that mpiexec does not know that I want each replica to fork into 4 OpenMP threads. Thus, without this option and without affinities (more on that in a second), mpiexec starts too many replicas on some nodes - gromacs then complains about the overload - while some cores on other nodes are not used. It is possible to run my simulation like that:

mpiexec mdrun_mpi -v -cpt 20 -multi 144 -replex 2000 -cpi
(without --loadbalance for mpiexec and without -ntomp for mdrun)

Then each replica runs on 4 MPI processes (I allocate 4 times more cores than replicas, and mdrun sees it). The problem is that it is much slower than using OpenMP within each replica. I did not find any way other than --loadbalance in mpiexec, plus -multi 144 -ntomp 4 in mdrun, to use MPI and OpenMP at the same time on the Torque-controlled cluster.
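
For concreteness, the hybrid invocation described above would be along these lines (a sketch; -np 144 is implied rather than copied from my actual script):

#PBS -l nodes=48:ppn=12
setenv OMP_NUM_THREADS 4
mpiexec -np 144 --loadbalance mdrun_mpi -v -cpt 20 -multi 144 -ntomp 4 -replex 2000 -cpi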



That seems highly surprising. I have not yet encountered a job scheduler that was completely lacking a "do what I tell you" layout scheme. More importantly, why are you using #PBS -l nodes=48:ppn=12?



I think that Torque is very similar to all PBS-like resource managers in this regard. It actually does what I tell it to do. There are 12-core nodes, I ask for 48 of them - I get them (a simple #PBS -l ncpus=576 does not work), end of story. Now, the program that I run is responsible for populating the resources that I got.


No, that's not the end of the story. The scheduler and the MPI system typically cooperate to populate the MPI processes on the hardware, set OMP_NUM_THREADS, set affinities, etc. mdrun honours those if they are set.
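
For example, a quick sanity check inside the job script shows what Torque actually handed over (a sketch using standard Torque variables):

sort $PBS_NODEFILE | uniq -c    # slots granted per node
echo $OMP_NUM_THREADS           # what the environment actually set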


I was able to run what I wanted flawlessly on another cluster with PBS-Pro. The Torque cluster seems to work as I said (the "end of story" behaviour). REMD runs well on Torque when I give a whole physical node to one replica. Otherwise the simulation does not go, or the pinning fails (sometimes partially). I ran out of options; I did not find any working example/documentation on running hybrid MPI/OpenMP jobs in Torque. It seems that I stumbled upon limitations of this resource manager, and it is not really a GROMACS issue.
Best Regards,
Grzegorz



You seem to be using 12 because you know there are 12 cores per node. The scheduler should know that already. ppn should be a command about what to do with the hardware, not a description of what it is. More to the point, you should read the docs and be sure what it does.

Surely you want 3 MPI processes per 12-core node?



Yes - I want each node to run 3 MPI processes. Preferably, I would like to run each MPI process on a separate node (spread over 12 cores with OpenMP), but I will not get that much in the way of resources. But again, without the --loadbalance hack I would not be able to populate the nodes properly...


So try ppn 3!
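
That is, presumably something like this (an untested sketch of the layout I mean):

#PBS -l nodes=48:ppn=3
setenv OMP_NUM_THREADS 4
mpiexec mdrun_mpi -v -cpt 20 -multi 144 -ntomp 4 -replex 2000 -cpi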


What do the .log files say about
OMP_NUM_THREADS, thread affinities, pinning, etc?




Each replica logs:
"Using 1 MPI process
Using 4 OpenMP threads",
which is correct. As I said, the threads are forked, but 3 out of 4 do not do anything, and the simulation does not go at all.

About affinities, GROMACS says:
"Can not set thread affinities on the current platform. On NUMA systems this can cause performance degradation. If you think your platform should support setting affinities, contact the GROMACS developers."

Well, the "current platform" is a normal x86_64 cluster, but all the information about resources is passed by Torque to the OpenMPI-linked GROMACS. Can it be that mdrun sees the resources allocated by Torque as one big pool of CPUs and misses the information about the node topology?



mdrun gets its processor topology from the MPI layer, so that is where you need to focus. The error message confirms that GROMACS sees things that seem wrong.
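
If it is OpenMPI underneath, something like the following shows where each rank actually lands (a sketch; --display-map is OpenMPI-specific):

mpiexec --display-map -np 8 mdrun_mpi -v -multi 8 -ntomp 4 ...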



Thank you, I will take a look. But the first thing I want to do is to find the reason why GROMACS 4.6.3 is not able to run on my (slightly weird, I admit) setup, while 4.6.2 does it very well.


4.6.2 had a bug that inhibited any MPI-based mdrun from attempting to
set affinities. It's still not clear why ppn 12 worked at all.
Apparently mdrun was able to float some processes around to get
something that worked. The good news is that when you get it working
in 4.6.3, you will see a performance boost.

Mark

