Hi,

That's still likely to be disastrous for performance. mdrun uses all the CPU cores that you permit it to use, as well as the GPU, so running two mdrun processes on the same cores risks a super-linear slowdown. See the suggested examples at http://manual.gromacs.org/documentation/2016.1/user-guide/mdrun-performance.html#examples-for-mdrun-on-one-node
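A rough sketch of how one node might be split between two runs so that they do not land on the same cores. It only prints candidate command lines and does not start anything. The helper name plan_mdrun, the count of 12 logical cores, and the .tpr file names are assumptions for illustration, not taken from this thread. The options -nt, -pin, -pinoffset, -pinstride and -gpu_id are standard mdrun options, but whether pinning takes effect depends on the platform, and depending on the GROMACS version and build, -gpu_id may need one ID per rank when -nt starts several ranks.

```shell
#!/bin/sh
# Sketch only: print (do not run) an mdrun command line that gives one of
# NJOBS jobs its own contiguous block of logical cores.
# Usage: plan_mdrun TOTAL_LOGICAL_CORES NJOBS JOB_INDEX TPR_FILE GPU_ID
# All argument values used below are illustrative assumptions.
plan_mdrun() {
    total=$1
    njobs=$2
    job=$3
    tpr=$4
    gpu=$5
    per_job=$((total / njobs))   # threads each job is allowed to use
    offset=$((job * per_job))    # first logical core of this job's block
    # -pinstride 1 places the job's threads on consecutive logical cores,
    # so the two blocks do not overlap.
    echo "gmx mdrun -s $tpr -nt $per_job -pin on -pinoffset $offset -pinstride 1 -gpu_id $gpu"
}

# Two jobs on a hypothetical 12-logical-core machine, one GPU each.
plan_mdrun 12 2 0 md_a.tpr 0
plan_mdrun 12 2 1 md_b.tpr 1
```

Review the printed lines against the examples in the linked guide before running them, and adjust the thread counts to what the build actually supports.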
Mark

On Mon, 9 Jan 2017 16:12 Natalie Tatum <nataliejta...@gmail.com> wrote:

> Dear Justin,
>
> Thanks for the advice - after a clean up, a reboot, and some careful
> application of commands, everything seems to be running nicely again.
> Switching the call to below (instead of using -deffnm) is working.
>
> gmx mdrun -s md.tpr -gpu_id 1 &
>
> Many thanks,
>
> Natalie
>
> On 4 January 2017 at 01:02, Justin Lemkul <jalem...@vt.edu> wrote:
>
> > On 1/3/17 10:43 AM, Natalie Tatum wrote:
> >
> >> Dear all,
> >>
> >> I'm hoping you can shed light on (a) what my mdrun problem is and (b)
> >> where to start fixing it.
> >>
> >> I'm simulating different mutants of a protein dimer on DNA, for 10 ns
> >> a-piece. I have successfully run this protocol on the wild-type
> >> protein, on two single residue mutants, and on a double mutant. I came
> >> to run the same on a fourth, single site mutant. I have followed the
> >> same protocols and utilised the same MDP settings throughout. All were
> >> subject to 5000 steps of steepest-descent energy minimisation, then
> >> 200 ps of equilibration in the NVT ensemble, then the same in the NPT.
> >> For this particular mutant there were no issues apparent going into
> >> production MD. Therefore, I don't think it's an issue of my MDP setup
> >> or system...
> >>
> >> So I have two compatible (OpenCL 1.2) AMD Radeon HD Firepro D300 GPUs,
> >> and I have one mutant (run/process) assigned to each.
> >>
> >> For this mutant I call mdrun with:
> >>
> >> gmx mdrun -deffnm md -gpu_id 1 &
> >>
> >> Whereas the other is on -gpu_id 0, and walk away. This worked
> >> successfully in the week prior for two other systems. It's New Year,
> >> then I come back to what should be completed simulations this morning
> >> to get my hands dirty in analysis.
> >>
> >> Run on gpu 0 has completed successfully, all is grand.
> >>
> >> Mutant on gpu 1 has not. Attempts to resume/restart fail (on either
> >> GPU, or both, or calling neither explicitly). All output looks like
> >> this:
> >>
> >> GROMACS: gmx mdrun, VERSION 5.1.3
> >> Executable: /usr/local/gromacs/bin/gmx
> >> Data prefix: /usr/local/gromacs
> >> Command line:
> >> gmx mdrun -deffnm md
> >
> > From the .log, it appears your command was not what you think it was. Is
> > it possible that the job failed because mdrun tried to consume all
> > available hardware and got hung up?
> >
> >> GROMACS version: VERSION 5.1.3
> >> Precision: single
> >> Memory model: 64 bit
> >> MPI library: thread_mpi
> >> OpenMP support: disabled
> >> GPU support: enabled
> >> OpenCL support: enabled
> >> invsqrt routine: gmx_software_invsqrt(x)
> >> SIMD instructions: AVX_256
> >> FFT library: fftw-3.3.4-sse2
> >> RDTSCP usage: enabled
> >> C++11 compilation: disabled
> >> TNG support: enabled
> >> Tracing support: disabled
> >> Built on: Mon 1 Aug 2016 17:20:18 BST
> >> Built by: natalie@t <nata...@nicr00353.ncl.ac.uk> hemachineIuse.here.there [CMAKE]
> >> Build OS/arch: Darwin 15.5.0 x86_64
> >> Build CPU vendor: GenuineIntel
> >> Build CPU brand: Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz
> >> Build CPU family: 6 Model: 62 Stepping: 4
> >> Build CPU features: aes apic avx clfsh cmov cx8 cx16 f16c htt lahf_lm
> >> mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp
> >> sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
> >> C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc Clang 7.3.0.7030031
> >> C compiler flags: -mavx -Wall -Wno-unused -Wunused-value
> >> -Wunused-parameter -Wno-unknown-pragmas -O3 -DNDEBUG
> >> C++ compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ Clang 7.3.0.7030031
> >> C++ compiler flags: -mavx -Wextra -Wno-missing-field-initializers
> >> -Wpointer-arith -Wall -Wno-unused-function -Wno-unknown-pragmas -O3
> >> -DNDEBUG
> >> Boost version: 1.60.0 (external)
> >> OpenCL include dir: /System/Library/Frameworks/OpenCL.framework
> >> OpenCL library: /System/Library/Frameworks/OPENCL.framework
> >> OpenCL version: 1.2
> >>
> >> And there it ends. No files except the log shown above - and though
> >> this initial output looks identical in content to the beginnings of
> >> logs for successful simulations, mdrun does not then seem to engage
> >> with the GPU/CPUs available.
> >>
> >> There are no error messages, no apparent indication as to where this
> >> has gone wrong... And now I can't run mdrun at all, for any system.
> >
> > Test whether or not your GPU is still accessible and capable of running
> > test programs.
> >
> > -Justin
> >
> >> I've checked my disk space (fine, >100 GB available), I'm able to call
> >> and execute other gmx commands, but mdrun does the above.
> >>
> >> The closest error I can find with my google-fu is three years ago where
> >> this user (
> >> http://gromacs.org_gmx-users.maillist.sys.kth.narkive.com/FEdWd6gC/mdrun-no-error-but-hangs-no-results
> >> ) got no error but a killed process, but I don't even get as far as
> >> detection of CPUs/GPUs or domain decomposition.
> >>
> >> Any suggestions much appreciated,
> >>
> >> Natalie
> >
> > --
> > ==================================================
> >
> > Justin A. Lemkul, Ph.D.
> > Ruth L. Kirschstein NRSA Postdoctoral Fellow
> >
> > Department of Pharmaceutical Sciences
> > School of Pharmacy
> > Health Sciences Facility II, Room 629
> > University of Maryland, Baltimore
> > 20 Penn St.
> > Baltimore, MD 21201
> >
> > jalem...@outerbanks.umaryland.edu | (410) 706-7441
> > http://mackerell.umaryland.edu/~jalemkul
> >
> > ==================================================
> > --
> > Gromacs Users mailing list
> >
> > * Please search the archive at
> > http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> > posting!
> >
> > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >
> > * For (un)subscribe requests visit
> > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> > send a mail to gmx-users-requ...@gromacs.org.
>
> --
> *Dr. Natalie J. Tatum*
> Post-doctoral Research Associate
> Northern Institute for Cancer Research
> Newcastle University