Hi everybody. I've been experimenting with REMD for my system, running on 48 cores with 4 GPUs. I will need to scale up to 73 replicas because this is a complicated system with many degrees of freedom (I'm open to being told this is all a silly idea).
My run configuration is:

    mpirun -np 4 --map-by numa gmx_mpi mdrun -cpi memb_prod1.cpt -ntomp 11 -v -deffnm memb_prod1 -multidir 1 2 3 4 -replex 1000

The best I can squeeze out of this is 9 ns/day. In a non-replica simulation I can hit 50 ns/day with a single GPU and 12 cores. Looking at my accounting for a single replica, 52% of the time is being spent in the "Force" category, with 92% of my Mflops going into "NxN Ewald Elec. + LJ [F]". I'm wondering what, if anything, I could do to reduce this bottleneck.

Thank you.

--
Miro A. Astore (he/him)
PhD Candidate | Computational Biophysics
Office 434 A28
School of Physics
University of Sydney