Please upload them to a file-sharing service on the web (there are lots that are free-to-use), and paste the link here.
Mark

On Mon, Aug 25, 2014 at 6:07 AM, Yunlong Liu <yliu...@jhmi.edu> wrote:
> Hi Szilard,
>
> I would like to send you the log file and I really need your help. Please
> trust me that I have tested this many times: whenever I turn on dlb, the
> GPU nodes report a cannot-allocate-memory error and all MPI processes are
> shut down. I have to tolerate the large load imbalance (50%) to run my
> simulations. I wish I could find some way to make my simulations run on
> GPUs with better performance.
>
> Where can I post the log file? If I paste it here, it will be really long.
>
> Yunlong
>
> On Aug 24, 2014, at 2:20 PM, "Szilárd Páll" <pall.szil...@gmail.com> wrote:
>
>> On Thu, Aug 21, 2014 at 8:25 PM, Yunlong Liu <yliu...@jh.edu> wrote:
>>> Hi Roland,
>>>
>>> I just compiled the latest gromacs-5.0 version released on June 29th. I
>>> will recompile it as you suggested, using those flags. It also seems that
>>> the high load imbalance doesn't affect the performance, which is weird.
>>
>> How did you draw that conclusion? Please show us log files of the
>> respective runs; that will help to assess what is going on.
>>
>> --
>> Szilárd
>>
>>> Thank you.
>>> Yunlong
>>>
>>> On 8/21/14, 2:13 PM, Roland Schulz wrote:
>>>>
>>>> Hi,
>>>>
>>>> On Thu, Aug 21, 2014 at 1:56 PM, Yunlong Liu <yliu...@jh.edu> wrote:
>>>>
>>>> Hi Roland,
>>>>
>>>> The problem I am posting is not caused by trivial errors (like simply
>>>> not having enough memory); I think it is a real bug inside the GROMACS
>>>> GPU support code.
>>>>
>>>> It is unlikely to be a trivial error, because otherwise someone else
>>>> would have noticed. You could try the release-5-0 branch from git, but
>>>> I'm not aware of any bug fixes related to memory allocation.
>>>> The memory allocation that triggers the error isn't the problem; the
>>>> printed size is reasonable. You could recompile with PRINT_ALLOC_KB
>>>> (add -DPRINT_ALLOC_KB to CMAKE_C_FLAGS) and rerun the simulation. It
>>>> might tell you where the unusually large memory allocation happens.
>>>>
>>>> PS: Please don't reply to an individual GROMACS developer. Keep all
>>>> conversation on the gmx-users list.
>>>>
>>>> Roland
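In practice, Roland's suggestion amounts to re-running CMake in the existing build
directory with the extra define and rebuilding. A minimal sketch, assuming a
hypothetical build-directory path and make parallelism (only the -DPRINT_ALLOC_KB
define itself comes from the thread):

    cd /path/to/gromacs-5.0-build                 # hypothetical build directory
    cmake . -DCMAKE_C_FLAGS="-DPRINT_ALLOC_KB"    # re-add any C flags the original build used
    make -j 8 && make install

With the define compiled in, rerunning the failing job might then show where the
unusually large allocation is requested, as Roland describes.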
>>>> That is the reason why I post this problem to the developer
>>>> mailing list.
>>>>
>>>> My system contains ~240,000 atoms. It is a rather big protein. The
>>>> memory information of the node is:
>>>>
>>>> top - 12:46:59 up 15 days, 22:18,  1 user,  load average: 1.13, 6.27, 11.28
>>>> Tasks: 510 total,   2 running, 508 sleeping,   0 stopped,   0 zombie
>>>> Cpu(s):  6.3%us,  0.0%sy,  0.0%ni, 93.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
>>>> Mem:  32815324k total,  4983916k used, 27831408k free,     7984k buffers
>>>> Swap:  4194296k total,        0k used,  4194296k free,   700588k cached
>>>>
>>>> I am running the simulation on 2 nodes, with 4 MPI ranks and 8 OpenMP
>>>> threads per rank. I list the information on their CPU and GPU here:
>>>>
>>>> c442-702.stampede(1)$ nvidia-smi
>>>> Thu Aug 21 12:46:17 2014
>>>> +------------------------------------------------------+
>>>> | NVIDIA-SMI 331.67     Driver Version: 331.67         |
>>>> |-------------------------------+----------------------+----------------------+
>>>> | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
>>>> | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
>>>> |===============================+======================+======================|
>>>> |   0  Tesla K20m          Off  | 0000:03:00.0     Off |                    0 |
>>>> | N/A   22C    P0    46W / 225W |    172MiB /  4799MiB |      0%      Default |
>>>> +-------------------------------+----------------------+----------------------+
>>>>
>>>> +-----------------------------------------------------------------------------+
>>>> | Compute processes:                                               GPU Memory |
>>>> |  GPU       PID  Process name                                     Usage      |
>>>> |=============================================================================|
>>>> |    0    113588  /work/03002/yliu120/gromacs-5/bin/mdrun_mpi          77MiB  |
>>>> |    0    113589  /work/03002/yliu120/gromacs-5/bin/mdrun_mpi          77MiB  |
>>>> +-----------------------------------------------------------------------------+
>>>>
>>>> c442-702.stampede(4)$ lscpu
>>>> Architecture:          x86_64
>>>> CPU op-mode(s):        32-bit, 64-bit
>>>> Byte Order:            Little Endian
>>>> CPU(s):                16
>>>> On-line CPU(s) list:   0-15
>>>> Thread(s) per core:    1
>>>> Core(s) per socket:    8
>>>> Socket(s):             2
>>>> NUMA node(s):          2
>>>> Vendor ID:             GenuineIntel
>>>> CPU family:            6
>>>> Model:                 45
>>>> Stepping:              7
>>>> CPU MHz:               2701.000
>>>> BogoMIPS:              5399.22
>>>> Virtualization:        VT-x
>>>> L1d cache:             32K
>>>> L1i cache:             32K
>>>> L2 cache:              256K
>>>> L3 cache:              20480K
>>>> NUMA node0 CPU(s):     0-7
>>>> NUMA node1 CPU(s):     8-15
>>>>
>>>> I hope this information will help. Thank you.
>>>>
>>>> Yunlong
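For orientation, a launch matching that layout on Stampede would look roughly like
the sketch below. The SLURM directives, input file name, and output prefix are
assumptions reconstructed from the description above, not copied from the thread:

    #SBATCH -N 2                  # 2 nodes, as described above
    #SBATCH -n 4                  # 4 MPI ranks in total, i.e. 2 per node
    export OMP_NUM_THREADS=8      # 8 OpenMP threads per rank
    ibrun mdrun_mpi -ntomp 8 -s topol.tpr -deffnm run   # topol.tpr and "run" are placeholders

With one K20m per node and two ranks per node, the two mdrun_mpi processes in the
nvidia-smi listing above are sharing that node's single GPU.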
>>>> On 8/21/14, 1:38 PM, Roland Schulz wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> please don't use gmx-developers for user questions. Feel free to use it
>>>>> if you want to fix the problem and have questions about implementation
>>>>> details.
>>>>>
>>>>> Please provide more details: How large is your system? How much memory
>>>>> does a node have? On how many nodes do you try to run? How many MPI
>>>>> ranks do you have per node?
>>>>>
>>>>> Roland
>>>>>
>>>>> On Thu, Aug 21, 2014 at 12:21 PM, Yunlong Liu <yliu...@jh.edu> wrote:
>>>>>
>>>>> Hi Gromacs Developers,
>>>>>
>>>>> I found something really interesting about dynamic load balancing. I am
>>>>> running my simulation on the Stampede supercomputer, whose nodes have 16
>>>>> physical cores (really 16 Intel Xeon cores on one node) and one NVIDIA
>>>>> Tesla K20m GPU attached.
>>>>>
>>>>> When I am using only the CPUs, I turn on dynamic load balancing with
>>>>> -dlb yes, and it seems to work really well: the load imbalance is only
>>>>> 1~2%, which improves the performance by 5~7%. But when I run on a
>>>>> CPU-GPU hybrid node (16 CPU cores and 1 GPU), dynamic load balancing
>>>>> kicks in because the imbalance goes up to ~50% right after start-up,
>>>>> and then the system reports a fail-to-allocate-memory error:
>>>>>
>>>>> NOTE: Turning on dynamic load balancing
>>>>>
>>>>> -------------------------------------------------------
>>>>> Program mdrun_mpi, VERSION 5.0
>>>>> Source code file:
>>>>> /home1/03002/yliu120/build/gromacs-5.0/src/gromacs/utility/smalloc.c, line: 226
>>>>>
>>>>> Fatal error:
>>>>> Not enough memory. Failed to realloc 1020720 bytes for dest->a, dest->a=d5800030
>>>>> (called from file
>>>>> /home1/03002/yliu120/build/gromacs-5.0/src/gromacs/mdlib/domdec_top.c, line 1061)
>>>>> For more information and tips for troubleshooting, please check the GROMACS
>>>>> website at http://www.gromacs.org/Documentation/Errors
>>>>> -------------------------------------------------------
>>>>> : Cannot allocate memory
>>>>> Error on rank 0, will try to stop all ranks
>>>>> Halting parallel program mdrun_mpi on CPU 0 out of 4
>>>>>
>>>>> gcq#274: "I Feel a Great Disturbance in the Force" (The Emperor Strikes Back)
>>>>>
>>>>> [cli_0]: aborting job:
>>>>> application called MPI_Abort(MPI_COMM_WORLD, -1) - process 0
>>>>> [c442-702.stampede.tacc.utexas.edu:mpispawn_0][readline] Unexpected End-Of-File on file descriptor 6. MPI process died?
>>>>> [c442-702.stampede.tacc.utexas.edu:mpispawn_0][mtpmi_processops] Error while reading PMI socket. MPI process died?
>>>>> [c442-702.stampede.tacc.utexas.edu:mpispawn_0][child_handler] MPI process (rank: 0, pid: 112839) exited with status 255
>>>>> TACC: MPI job exited with code: 1
>>>>>
>>>>> TACC: Shutdown complete. Exiting.
>>>>>
>>>>> So I manually turned off dynamic load balancing with -dlb no. The
>>>>> simulation then goes through, but with very high load imbalance, like:
>>>>>
>>>>> DD  step 139999  load imb.: force 51.3%
>>>>>
>>>>>            Step           Time         Lambda
>>>>>          140000      280.00000        0.00000
>>>>>
>>>>>    Energies (kJ/mol)
>>>>>             U-B    Proper Dih.  Improper Dih.      CMAP Dih.          LJ-14
>>>>>     4.88709e+04    1.21990e+04    2.99128e+03   -1.46719e+03    1.98569e+04
>>>>>      Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>>>>>     2.54663e+05    4.05141e+05   -3.16020e+04   -3.75610e+06    2.24819e+04
>>>>>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>>>>>    -3.02297e+06    6.15217e+05   -2.40775e+06    3.09312e+02   -2.17704e+02
>>>>>  Pressure (bar)   Constr. rmsd
>>>>>    -3.39003e+01    3.10750e-05
>>>>>
>>>>> DD  step 149999  load imb.: force 60.8%
>>>>>
>>>>>            Step           Time         Lambda
>>>>>          150000      300.00000        0.00000
>>>>>
>>>>>    Energies (kJ/mol)
>>>>>             U-B    Proper Dih.  Improper Dih.      CMAP Dih.          LJ-14
>>>>>     4.96380e+04    1.21010e+04    2.99986e+03   -1.51918e+03    1.97542e+04
>>>>>      Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>>>>>     2.54305e+05    4.06024e+05   -3.15801e+04   -3.75534e+06    2.24001e+04
>>>>>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>>>>>    -3.02121e+06    6.17009e+05   -2.40420e+06    3.10213e+02   -2.17403e+02
>>>>>  Pressure (bar)   Constr. rmsd
>>>>>    -1.40623e+00    3.16495e-05
>>>>>
>>>>> I think this high load imbalance costs more than 20% of the performance,
>>>>> but at least it lets the simulation run. So the problem I would like to
>>>>> report is that, when running a CPU-GPU hybrid simulation with very few
>>>>> GPUs, dynamic load balancing causes domain decomposition problems
>>>>> (fail-to-allocate-memory). I don't know whether there is any solution to
>>>>> this at the moment, or whether anything could be improved.
>>>>>
>>>>> Yunlong
>>>>>
>>>>> --
>>>>> ========================================
>>>>> Yunlong Liu, PhD Candidate
>>>>> Computational Biology and Biophysics
>>>>> Department of Biophysics and Biophysical Chemistry
>>>>> School of Medicine, The Johns Hopkins University
>>>>> Email: yliu...@jhmi.edu
>>>>> Address: 725 N Wolfe St, WBSB RM 601, 21205
>>>>> ========================================
>>>>>
>>>>> --
>>>>> ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
>>>>> 865-241-1537, ORNL PO BOX 2008 MS6309
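For anyone reproducing the workaround described above, it corresponds to a command
line roughly like the following. The launcher and file names are placeholders
consistent with the setup given earlier, not an official fix; in GROMACS 5.0 the
-dlb option accepts auto, no, and yes, with auto as the default:

    # workaround from the report above: keep dynamic load balancing off
    ibrun mdrun_mpi -dlb no -ntomp 8 -s topol.tpr -deffnm run_nodlb
    # with the default -dlb auto, DLB switches on once the measured imbalance is
    # large, which is where the realloc failure in domdec_top.c was triggered

The price of running this way is the ~51-60% force load imbalance shown in the log
excerpt above, which the reporter estimates at more than 20% of the achievable
performance.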
--
Gromacs Users mailing list

* Please search the archive at
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
send a mail to gmx-users-requ...@gromacs.org.