Hi,

That's still likely disastrous for performance. Mdrun uses all of the CPU
cores that you permit it, as well as the GPU, and running two mdrun processes
on the same cores risks a super-linear slowdown. See the suggested examples at
http://manual.gromacs.org/documentation/2016.1/user-guide/mdrun-performance.html#examples-for-mdrun-on-one-node
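
For two runs sharing one node, the pattern in those examples is along these
lines (a sketch only, assuming a build with OpenMP enabled and, for
illustration, twelve hardware threads split evenly; md0.tpr and md1.tpr are
placeholder names for the two inputs):

gmx mdrun -s md0.tpr -ntmpi 1 -ntomp 6 -pin on -pinoffset 0 -gpu_id 0 &
gmx mdrun -s md1.tpr -ntmpi 1 -ntomp 6 -pin on -pinoffset 6 -gpu_id 1 &

That way each run gets its own GPU and its own set of cores, rather than both
competing for all of them.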

Mark

On Mon, 9 Jan 2017 16:12 Natalie Tatum <nataliejta...@gmail.com> wrote:

> Dear Justin,
>
> Thanks for the advice - after a clean-up, a reboot, and some careful
> application of commands, everything seems to be running nicely again.
> Switching the call to the one below (instead of using -deffnm) is working:
>
> gmx mdrun -s md.tpr -gpu_id 1 &
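>
> (The matching run for the other mutant is, for illustration, just the same
> call pointed at the other device:
>
> gmx mdrun -s md.tpr -gpu_id 0 &
>
> with each run started from its own directory so that the default output file
> names don't collide.)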
>
> Many thanks,
>
> Natalie
>
>
>
>
> On 4 January 2017 at 01:02, Justin Lemkul <jalem...@vt.edu> wrote:
>
> >
> >
> > On 1/3/17 10:43 AM, Natalie Tatum wrote:
> >
> >> Dear all,
> >>
> >> I'm hoping you can shed light on (a) what my mdrun problem is and (b)
> >> where to start fixing it.
> >>
> >> I'm simulating different mutants of a protein dimer on DNA, for 10 ns
> >> apiece. I have successfully run this protocol on the wild-type protein,
> >> on two single-residue mutants, and on a double mutant. I then came to run
> >> the same on a fourth, single-site mutant. I have followed the same
> >> protocols and utilised the same MDP settings throughout. All were subject
> >> to 5000 steps of steepest-descent energy minimisation, then 200 ps of
> >> equilibration in the NVT ensemble, then the same in the NPT ensemble. For
> >> this particular mutant there were no issues apparent going into
> >> production MD, so I don't think it's an issue with my MDP setup or
> >> system...
> >>
> >> So I have two compatible (OpenCL 1.2) AMD Radeon HD FirePro D300 GPUs,
> >> and I have one mutant (run/process) assigned to each.
> >>
> >> For this mutant I call mdrun with:
> >>
> >> gmx mdrun -deffnm md -gpu_id 1 &
> >>
> >> The other run is on -gpu_id 0, and then I walk away. This worked
> >> successfully in the week prior for two other systems. After the New Year
> >> break, I came back this morning to what should be completed simulations,
> >> ready to get my hands dirty with the analysis.
> >>
> >> The run on GPU 0 has completed successfully; all is grand.
> >>
> >> The mutant on GPU 1 has not. Attempts to resume/restart fail (on either
> >> GPU, or both, or without calling either explicitly). All output looks
> >> like this:
> >>
> >> GROMACS:      gmx mdrun, VERSION 5.1.3
> >>
> >> Executable:   /usr/local/gromacs/bin/gmx
> >>
> >> Data prefix:  /usr/local/gromacs
> >>
> >> Command line:
> >>
> >>
> >>
> >>   gmx mdrun -deffnm md
> >>
> >>
> > From the .log, it appears your command was not what you think it was.  Is
> > it possible that the job failed because mdrun tried to consume all
> > available hardware and got hung up?
> >
> >
> >>
> >> GROMACS version:    VERSION 5.1.3
> >>
> >> Precision:          single
> >>
> >> Memory model:       64 bit
> >>
> >> MPI library:        thread_mpi
> >>
> >> OpenMP support:     disabled
> >>
> >> GPU support:        enabled
> >>
> >> OpenCL support:     enabled
> >>
> >> invsqrt routine:    gmx_software_invsqrt(x)
> >>
> >> SIMD instructions:  AVX_256
> >>
> >> FFT library:        fftw-3.3.4-sse2
> >>
> >> RDTSCP usage:       enabled
> >>
> >> C++11 compilation:  disabled
> >>
> >> TNG support:        enabled
> >>
> >> Tracing support:    disabled
> >>
> >> Built on:           Mon  1 Aug 2016 17:20:18 BST
> >>
> >> Built by:           natalie@themachineIuse.here.there [CMAKE]
> >>
> >>
> >> Build OS/arch:      Darwin 15.5.0 x86_64
> >>
> >> Build CPU vendor:   GenuineIntel
> >>
> >> Build CPU brand:    Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz
> >>
> >> Build CPU family:   6   Model: 62   Stepping: 4
> >>
> >> Build CPU features: aes apic avx clfsh cmov cx8 cx16 f16c htt lahf_lm mmx
> >> msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp sse2
> >> sse3 sse4.1 sse4.2 ssse3 tdt x2apic
> >>
> >> C compiler:         /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc Clang 7.3.0.7030031
> >>
> >> C compiler flags:    -mavx    -Wall -Wno-unused -Wunused-value
> >> -Wunused-parameter -Wno-unknown-pragmas  -O3 -DNDEBUG
> >>
> >> C++ compiler:       /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ Clang 7.3.0.7030031
> >>
> >> C++ compiler flags:  -mavx    -Wextra -Wno-missing-field-initializers
> >> -Wpointer-arith -Wall -Wno-unused-function -Wno-unknown-pragmas  -O3
> >> -DNDEBUG
> >>
> >> Boost version:      1.60.0 (external)
> >>
> >> OpenCL include dir: /System/Library/Frameworks/OpenCL.framework
> >>
> >> OpenCL library:     /System/Library/Frameworks/OPENCL.framework
> >>
> >> OpenCL version:     1.2
> >>
> >>
> >> And there it ends. No files except the log shown above - and though this
> >> initial output looks identical in content to the beginnings of logs for
> >> successful simulations, mdrun does not then seem to engage with the
> >> GPU/CPUs available.
> >>
> >> There are no error messages, no apparent indication as to where this has
> >> gone wrong... And now I can't run mdrun at all, for any system.
> >>
> >>
> > Test whether or not your GPU is still accessible and capable of running
> > test programs.
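> >
> > For instance (a suggestion only), on OS X
> >
> > system_profiler SPDisplaysDataType
> >
> > should still list both FirePro cards, and clinfo (if you have it
> > installed) will report what the OpenCL runtime itself can see.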
> >
> > -Justin
> >
> >> I've checked my disk space (fine, >100 GB available), and I'm able to
> >> call and execute other gmx commands, but mdrun just does the above.
> >>
> >> The closest error I can find with my google-fu is from three years ago,
> >> where this user (
> >> http://gromacs.org_gmx-users.maillist.sys.kth.narkive.com/FEdWd6gC/mdrun-no-error-but-hangs-no-results
> >> ) got no error but a killed process, but I don't even get as far as
> >> detection of CPUs/GPUs or domain decomposition.
> >>
> >> Any suggestions much appreciated,
> >>
> >> Natalie
> >>
> >>
> > --
> > ==================================================
> >
> > Justin A. Lemkul, Ph.D.
> > Ruth L. Kirschstein NRSA Postdoctoral Fellow
> >
> > Department of Pharmaceutical Sciences
> > School of Pharmacy
> > Health Sciences Facility II, Room 629
> > University of Maryland, Baltimore
> > 20 Penn St.
> > Baltimore, MD 21201
> >
> > jalem...@outerbanks.umaryland.edu | (410) 706-7441
> > http://mackerell.umaryland.edu/~jalemkul
> >
> > ==================================================
>
>
>
> --
> *Dr. Natalie J. Tatum*
> Post-doctoral Research Associate
> Northern Institute for Cancer Research
> Newcastle University