You seem to be using a relatively large number of GPUs. You may want to check your input data (many systems will not scale well to that many GPUs, though ensemble runs across many GPUs are quite common). Perhaps check the speedup in going from 1 to 2 to 4 GPUs on one node first.
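A single-node scaling check can be scripted along these lines. This is only a sketch: it assumes the thread-MPI build (plain gmx) at /shared/gromacs/5.1.5/bin/gmx that you used for the single-node runs, and that one of the equilibration .tpr files from your script has already been generated with grompp; adjust names and thread counts to your instance.

for ngpu in 1 2 4; do
    # use the first $ngpu GPUs, one thread-MPI rank per GPU
    ids=`seq 0 $((ngpu - 1)) | tr -d '\n'`
    /shared/gromacs/5.1.5/bin/gmx mdrun -deffnm step6.1_equilibration \
        -ntmpi $ngpu -ntomp 8 -gpu_id $ids \
        -nsteps 10000 -resethway -noconfout
done

Then compare the ns/day reported on the "Performance:" line at the end of each run's .log file.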

On 3/9/19 12:11 AM, Carlos Rivas wrote:
Hey guys,
Anybody running GROMACS on AWS ?

I have a strong IT background, but zero understanding of GROMACS or OpenMPI (and even less of SGE on AWS).
I'm just trying to help some PhD folks with their work.

When I run GROMACS using thread-MPI on a single, very large node on AWS, things work fairly fast.
However, when I switch from thread-MPI to OpenMPI, even though everything is detected properly, the performance is horrible.

This is what I am submitting to SGE:

ubuntu@ip-10-10-5-81:/shared/charmm-gui/gromacs$ cat sge.sh
#!/bin/bash
#
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -e out.err
#$ -o out.out
#$ -pe mpi 256

cd /shared/charmm-gui/gromacs
touch start.txt
/bin/bash /shared/charmm-gui/gromacs/run_eq.bash
touch end.txt
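A quick sanity check that could be added to a job script like this one, to confirm what SGE actually allocated before mpirun starts (a sketch only; $NSLOTS and $PE_HOSTFILE are standard SGE variables, the echo lines are a hypothetical addition):

echo "SGE allocated $NSLOTS slots on these hosts:"
cat "$PE_HOSTFILE"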

And this is my test script, provided by one of the doctors:

ubuntu@ip-10-10-5-81:/shared/charmm-gui/gromacs$ cat run_eq.bash
#!/bin/bash
export GMXMPI="/usr/bin/mpirun --mca btl ^openib /shared/gromacs/5.1.5/bin/gmx_mpi"

export MDRUN="mdrun -ntomp 2 -npme 32"

export GMX="/shared/gromacs/5.1.5/bin/gmx_mpi"

for comm in min eq; do
  if [ $comm == min ]; then
    echo ${comm}
    $GMX grompp -f step6.0_minimization.mdp -o step6.0_minimization.tpr \
         -c step5_charmm2gmx.pdb -p topol.top
    $GMXMPI $MDRUN -deffnm step6.0_minimization
  fi

  if [ $comm == eq ]; then
    for step in `seq 1 6`; do
      echo $step
      if [ $step -eq 1 ]; then
        echo ${step}
        $GMX grompp -f step6.${step}_equilibration.mdp -o step6.${step}_equilibration.tpr \
             -c step6.0_minimization.gro -r step5_charmm2gmx.pdb -n index.ndx -p topol.top
        $GMXMPI $MDRUN -deffnm step6.${step}_equilibration
      fi
      if [ $step -gt 1 ]; then
        old=`expr $step - 1`
        echo $old
        $GMX grompp -f step6.${step}_equilibration.mdp -o step6.${step}_equilibration.tpr \
             -c step6.${old}_equilibration.gro -r step5_charmm2gmx.pdb -n index.ndx -p topol.top
        $GMXMPI $MDRUN -deffnm step6.${step}_equilibration
      fi
    done
  fi
done
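For comparison, one layout sometimes tried on multi-GPU nodes is a single PP rank per GPU with more OpenMP threads per rank. A rough sketch only, not a recommendation from this thread; the mpirun options are standard Open MPI ones, and the thread and PME counts are guesses to be tuned:

# 32 ranks total = 8 per node = one rank per GPU
export GMXMPI="/usr/bin/mpirun -np 32 --map-by ppr:8:node --bind-to none --mca btl ^openib /shared/gromacs/5.1.5/bin/gmx_mpi"
# 8 OpenMP threads per rank fills the 64 logical cores on each node;
# -pin on lets mdrun handle thread pinning, -npme 0 disables separate PME ranks
export MDRUN="mdrun -ntomp 8 -npme 0 -pin on"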




During the run I see this in the output, and I get really excited, expecting blazing speeds, and yet it's much worse than a single node:

Command line:
   gmx_mpi mdrun -ntomp 2 -npme 32 -deffnm step6.0_minimization


Back Off! I just backed up step6.0_minimization.log to ./#step6.0_minimization.log.6#

Running on 4 nodes with total 128 cores, 256 logical cores, 32 compatible GPUs
   Cores per node:           32
   Logical cores per node:   64
   Compatible GPUs per node:  8
   All nodes have identical type(s) of GPUs
Hardware detected on host ip-10-10-5-89 (the node of MPI rank 0):
   CPU info:
     Vendor: GenuineIntel
     Brand:  Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
     SIMD instructions most likely to fit this hardware: AVX2_256
     SIMD instructions selected at GROMACS compile time: AVX2_256
   GPU info:
     Number of GPUs detected: 8
     #0: NVIDIA Tesla V100-SXM2-16GB, compute cap.: 7.0, ECC: yes, stat: compatible
     #1: NVIDIA Tesla V100-SXM2-16GB, compute cap.: 7.0, ECC: yes, stat: compatible
     #2: NVIDIA Tesla V100-SXM2-16GB, compute cap.: 7.0, ECC: yes, stat: compatible
     #3: NVIDIA Tesla V100-SXM2-16GB, compute cap.: 7.0, ECC: yes, stat: compatible
     #4: NVIDIA Tesla V100-SXM2-16GB, compute cap.: 7.0, ECC: yes, stat: compatible
     #5: NVIDIA Tesla V100-SXM2-16GB, compute cap.: 7.0, ECC: yes, stat: compatible
     #6: NVIDIA Tesla V100-SXM2-16GB, compute cap.: 7.0, ECC: yes, stat: compatible
     #7: NVIDIA Tesla V100-SXM2-16GB, compute cap.: 7.0, ECC: yes, stat: compatible

Reading file step6.0_minimization.tpr, VERSION 5.1.5 (single precision)
Using 256 MPI processes
Using 2 OpenMP threads per MPI process

On host ip-10-10-5-89 8 compatible GPUs are present, with IDs 0,1,2,3,4,5,6,7
On host ip-10-10-5-89 8 GPUs auto-selected for this run.
Mapping of GPU IDs to the 56 PP ranks in this node: 
0,0,0,0,0,0,0,1,1,1,1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,3,3,3,4,4,4,4,4,4,4,5,5,5,5,5,5,5,6,6,6,6,6,6,6,7,7,7,7,7,7,7
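Worth noting what the log itself reports: 256 MPI ranks with 2 OpenMP threads each is 512 threads on 128 physical / 256 logical cores, and the 56 PP ranks on each node share 8 GPUs (7 ranks per GPU). Whatever layout is tried, the runs can be compared from the "Performance:" lines mdrun writes at the end of each log, for example:

grep -H "Performance:" step6.0_minimization.log step6.*_equilibration.log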



Any suggestions? I'd greatly appreciate the help.


Carlos J. Rivas
Senior AWS Solutions Architect - Migration Specialist
