-------- Forwarded Message --------
From: Nikos Papadimitriou <nik...@ipta.demokritos.gr>
To: gmx-users@gromacs.org
Subject: [gmx-users] Gromacs 4.5.4 on multi-node cluster
Date: Wed, 7 Dec 2011 16:26:46 +0200

Dear All,

I had been running Gromacs 4.0.7 on a 12-node cluster (Intel i7-920, 4 cores per node) with OS Rocks 5.4.2. Recently, I upgraded the cluster OS to Rocks 5.4.3 and installed Gromacs 4.5.4 from the Bio Roll repository. When running in parallel on the same node, everything works fine. However, when I try to run on more than one node, the run stalls immediately with the following output:

[gromacs@tornado Test]$ /home/gromacs/.Installed/openmpi/bin/mpirun -np 2 -machinefile machines /home/gromacs/.Installed/gromacs/bin/mdrun_mpi -s md_run.tpr -o md_traj.trr -c md_confs.gro -e md.edr -g md.log -v
NNODES=2, MYRANK=0, HOSTNAME=compute-1-1.local
NNODES=2, MYRANK=1, HOSTNAME=compute-1-2.local
NODEID=0 argc=12
NODEID=1 argc=12

The mdrun_mpi process seems to start on both nodes, but the run does not proceed and no file is produced. It seems that the nodes are waiting for some kind of communication between them. The problem occurs even for the simplest case (e.g. an NVT simulation of 1000 argon atoms without Coulombic interactions). OpenMPI and the networking between the nodes seem to work fine, since other software that runs with MPI has no problems.

In an attempt to find a solution, I manually compiled and installed Gromacs 4.5.5 (with --enable-mpi) after installing the latest versions of OpenMPI and FFTW3, and no error occurred during the installation. However, when trying to run on two different nodes, exactly the same problem appears.

Do you have any idea what might cause this situation?
Thank you in advance!
-------- Forwarded Message --------
From: Mark Abraham <mark.abra...@anu.edu.au>
Reply-to: "Discussion list for GROMACS users" <gmx-users@gromacs.org>
To: Discussion list for GROMACS users <gmx-users@gromacs.org>
Subject: [gmx-users] Gromacs 4.5.4 on multi-node cluster
Date: Wed, 7 Dec 2011 16:53:49 +0200

On 8/12/2011 1:26 AM, Nikos Papadimitriou wrote:
> Dear All,
>
> I had been running Gromacs 4.0.7 on a 12-node cluster (Intel i7-920, 4 cores per node) with OS Rocks 5.4.2. [...] When running in parallel on the same node, everything works fine. However, when I try to run on more than one node, the run stalls immediately. [...] OpenMPI and the networking between the nodes seem to work fine, since other software that runs with MPI has no problems.

Can you run a 2-processor MPI test program with that machine file?

Mark
Unfortunately, other MPI programs run fine on 2 or more nodes, so there seems to be no problem with MPI itself.

> > In an attempt to find a solution, I have manually compiled and installed Gromacs 4.5.5 (with --enable-mpi) after having installed the latest versions of OpenMPI and FFTW3, and no error occurred during the installation. However, when trying to run on two different nodes, exactly the same problem appears.
> >
> > Do you have any idea what might cause this situation?
> > Thank you in advance!
--
gmx-users mailing list    gmx-users@gromacs.org
http://lists.gromacs.org/mailman/listinfo/gmx-users
Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
Please don't post (un)subscribe requests to the list. Use the www interface or send it to gmx-users-requ...@gromacs.org.
Can't post? Read http://www.gromacs.org/Support/Mailing_Lists