On Fri, Jul 18, 2014 at 8:25 PM, Yunlong Liu <yliu...@jhmi.edu> wrote: > Hi Mark, > > I post up my log file for the run here. Thank you. > > Log file opened on Wed Jul 16 11:26:51 2014 > Host: c442-403.stampede.tacc.utexas.edu pid: 31032 nodeid: 0 nnodes: 4 > GROMACS: mdrun_mpi_gpu, VERSION 5.0-rc1 > > GROMACS is written by: > Emile Apol Rossen Apostolov Herman J.C. Berendsen Par Bjelkmar > Aldert van Buuren Rudi van Drunen Anton Feenstra Sebastian Fritsch > Gerrit Groenhof Christoph Junghans Peter Kasson Carsten Kutzner > Per Larsson Justin A. Lemkul Magnus Lundborg Pieter Meulenhoff > Erik Marklund Teemu Murtola Szilard Pall Sander Pronk > Roland Schulz Alexey Shvetsov Michael Shirts Alfons Sijbers > Peter Tieleman Christian Wennberg Maarten Wolf > and the project leaders: > Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel > > Copyright (c) 1991-2000, University of Groningen, The Netherlands. > Copyright (c) 2001-2014, The GROMACS development team at > Uppsala University, Stockholm University and > the Royal Institute of Technology, Sweden. > check out http://www.gromacs.org for more information. > > GROMACS is free software; you can redistribute it and/or modify it > under the terms of the GNU Lesser General Public License > as published by the Free Software Foundation; either version 2.1 > of the License, or (at your option) any later version. 
> > GROMACS: mdrun_mpi_gpu, VERSION 5.0-rc1 > Executable: /work/03002/yliu120/gromacs-5.0/mv2_mkl/bin/mdrun_mpi_gpu > Library dir: /work/03002/yliu120/gromacs-5.0/mv2_mkl/share/gromacs/top > Command line: > mdrun_mpi_gpu -pin on -ntomp 8 -deffnm pi3k-wt-1 -gpu_id 00 > > Gromacs version: VERSION 5.0-rc1 > Precision: single > Memory model: 64 bit > MPI library: MPI > OpenMP support: enabled > GPU support: enabled > invsqrt routine: gmx_software_invsqrt(x) > SIMD instructions: AVX_256 > FFT library: Intel MKL > RDTSCP usage: enabled > C++11 compilation: disabled > TNG support: enabled > Tracing support: disabled > Built on: Wed Jun 4 13:59:17 CDT 2014 > Built by: xzhu...@login1.stampede.tacc.utexas.edu [CMAKE] > Build OS/arch: Linux 2.6.32-358.18.1.el6.x86_64 x86_64 > Build CPU vendor: GenuineIntel > Build CPU brand: Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz > Build CPU family: 6 Model: 45 Stepping: 7 > Build CPU features: aes apic avx clfsh cmov cx8 cx16 htt lahf_lm mmx msr > nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdtscp sse2 sse3 sse4.1 > sse4.2 ssse3 tdt x2apic > C compiler: /opt/apps/intel/13/composer_xe_2013.2.146/bin/intel64/icc > Intel 13.1.0.20130121 > C compiler flags: -mavx -fno-strict-aliasing -mkl=sequential -std=gnu99 > -w3 -wd111 -wd177 -wd181 -wd193 -wd271 -wd304 -wd383 -wd424 -wd444 -wd522 > -wd593 -wd869 -wd981 -wd1418 -wd1419 -wd1572 -wd1599 -wd2259 -wd2415 -wd2547 > -wd2557 -wd3280 -wd3346 -ip -funroll-all-loops -alias-const -ansi-alias > -O3 -DNDEBUG > C++ compiler: > /opt/apps/intel/13/composer_xe_2013.2.146/bin/intel64/icpc Intel > 13.1.0.20130121 > C++ compiler flags: -mavx -fno-strict-aliasing -w3 -wd111 -wd177 -wd181 > -wd193 -wd271 -wd304 -wd383 -wd424 -wd444 -wd522 -wd593 -wd869 -wd981 -wd1418 > -wd1419 -wd1572 -wd1599 -wd2259 -wd2415 -wd2547 -wd2557 -wd3280 -wd3346 > -wd1782 -ip -funroll-all-loops -alias-const -ansi-alias -O3 -DNDEBUG > Boost version: 1.51.0 (external) > CUDA compiler: /opt/apps/cuda/5.0/bin/nvcc nvcc: NVIDIA (R) 
Cuda > compiler driver;Copyright (c) 2005-2012 NVIDIA Corporation;Built on > Fri_Sep_21_17:28:58_PDT_2012;Cuda compilation tools, release 5.0, V0.2.1221 > CUDA compiler > flags:-gencode;arch=compute_20,code=sm_20;-gencode;arch=compute_20,code=sm_21;-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_35,code=compute_35;-use_fast_math;-ccbin=/opt/apps/intel/13/composer_xe_2013.2.146/bin/intel64/icc;;-Xcompiler;-gcc-version=450;; > > ;-mavx;-fno-strict-aliasing;-w3;-wd111;-wd177;-wd181;-wd193;-wd271;-wd304;-wd383;-wd424;-wd444;-wd522;-wd593;-wd869;-wd981;-wd1418;-wd1419;-wd1572;-wd1599;-wd2259;-wd2415;-wd2547;-wd2557;-wd3280;-wd3346;-wd1782;-ip;-funroll-all-loops;-alias-const;-ansi-alias;-O3;-DNDEBUG > CUDA driver: 5.50 > CUDA runtime: 5.0
Tips: you will get better performance if you use CUDA 5.5, gcc 4.8, and FFTW. The difference in total performance depends on your setup and could be anywhere between 0 and 15%.
> > > ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++ > B. Hess and C. Kutzner and D. van der Spoel and E. Lindahl > GROMACS 4: Algorithms for highly efficient, load-balanced, and scalable > molecular simulation > J. Chem. Theory Comput. 4 (2008) pp. 435-447 > -------- -------- --- Thank You --- -------- -------- > > > ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++ > D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark and H. J. C. > Berendsen > GROMACS: Fast, Flexible and Free > J. Comp. Chem. 26 (2005) pp. 1701-1719 > -------- -------- --- Thank You --- -------- -------- > > > ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++ > E. Lindahl and B. Hess and D. van der Spoel > GROMACS 3.0: A package for molecular simulation and trajectory analysis > J. Mol. Mod. 7 (2001) pp. 306-317 > -------- -------- --- Thank You --- -------- -------- > > > ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++ > H. J. C. Berendsen, D. van der Spoel and R. van Drunen > GROMACS: A message-passing parallel molecular dynamics implementation > Comp. Phys. Comm. 91 (1995) pp. 43-56 > -------- -------- --- Thank You --- -------- -------- > > > Number of CPUs detected (16) does not match the number reported by OpenMP (1). > Consider setting the launch configuration manually!
Something is still not right here. This message means that the OpenMP library reports that only *one* core is available (through omp_get_num_procs). Please consult your job scheduler's documentation, because this could affect performance (my guess is that it doesn't).
> > For optimal performance with a GPU nstlist (now 5) should be larger. > The optimum depends on your CPU and GPU resources. > You might want to try several nstlist values. 
> Changing nstlist from 5 to 40, rlist from 1 to 1.093 > > Input Parameters: > integrator = md > nsteps = 5000000 > init-step = 0 > cutoff-scheme = Verlet > ns-type = Grid > nstlist = 40 > ndelta = 2 > nstcomm = 100 > comm-mode = Linear > nstlog = 10000 > nstxout = 10000 > nstvout = 10000 > nstfout = 0 > nstcalcenergy = 100 > nstenergy = 10000 > nstxout-compressed = 10000 > init-t = 0 > delta-t = 0.002 > x-compression-precision = 1000 > fourierspacing = 0.16 > nkx = 96 > nky = 96 > nkz = 96 > pme-order = 4 > ewald-rtol = 1e-05 > ewald-rtol-lj = 0.001 > ewald-geometry = 0 > epsilon-surface = 0 > optimize-fft = FALSE > lj-pme-comb-rule = Geometric > ePBC = xyz > bPeriodicMols = FALSE > bContinuation = TRUE > bShakeSOR = FALSE > etc = V-rescale > bPrintNHChains = FALSE > nsttcouple = 5 > epc = Parrinello-Rahman > epctype = Isotropic > nstpcouple = 5 > tau-p = 2 > ref-p (3x3): > ref-p[ 0]={ 1.00000e+00, 0.00000e+00, 0.00000e+00} > ref-p[ 1]={ 0.00000e+00, 1.00000e+00, 0.00000e+00} > ref-p[ 2]={ 0.00000e+00, 0.00000e+00, 1.00000e+00} > compress (3x3): > compress[ 0]={ 4.50000e-05, 0.00000e+00, 0.00000e+00} > compress[ 1]={ 0.00000e+00, 4.50000e-05, 0.00000e+00} > compress[ 2]={ 0.00000e+00, 0.00000e+00, 4.50000e-05} > refcoord-scaling = No > posres-com (3): > posres-com[0]= 0.00000e+00 > posres-com[1]= 0.00000e+00 > posres-com[2]= 0.00000e+00 > posres-comB (3): > posres-comB[0]= 0.00000e+00 > posres-comB[1]= 0.00000e+00 > posres-comB[2]= 0.00000e+00 > verlet-buffer-tolerance = 0.005 > rlist = 1.093 > rlistlong = 1.093 > nstcalclr = 5 > rtpi = 0.05 > coulombtype = PME > coulomb-modifier = Potential-shift > rcoulomb-switch = 0 > rcoulomb = 1 > vdwtype = Cut-off > vdw-modifier = Potential-shift > rvdw-switch = 0 > rvdw = 1 > epsilon-r = 1 > epsilon-rf = inf > tabext = 1 > implicit-solvent = No > gb-algorithm = Still > gb-epsilon-solvent = 80 > nstgbradii = 1 > rgbradii = 1 > gb-saltconc = 0 > gb-obc-alpha = 1 > gb-obc-beta = 0.8 > gb-obc-gamma = 4.85 > gb-dielectric-offset 
= 0.009 > sa-algorithm = Ace-approximation > sa-surface-tension = 2.05016 > DispCorr = EnerPres > bSimTemp = FALSE > free-energy = no > nwall = 0 > wall-type = 9-3 > wall-atomtype[0] = -1 > wall-atomtype[1] = -1 > wall-density[0] = 0 > wall-density[1] = 0 > wall-ewald-zfac = 3 > pull = no > rotation = FALSE > interactiveMD = FALSE > disre = No > disre-weighting = Conservative > disre-mixed = FALSE > dr-fc = 1000 > dr-tau = 0 > nstdisreout = 100 > orires-fc = 0 > orires-tau = 0 > nstorireout = 100 > dihre-fc = 0 > em-stepsize = 0.01 > em-tol = 10 > niter = 20 > fc-stepsize = 0 > nstcgsteep = 1000 > nbfgscorr = 10 > ConstAlg = Lincs > shake-tol = 0.0001 > lincs-order = 4 > lincs-warnangle = 30 > lincs-iter = 1 > bd-fric = 0 > ld-seed = 645545913 > cos-accel = 0 > deform (3x3): > deform[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00} > deform[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00} > deform[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00} > adress = FALSE > userint1 = 0 > userint2 = 0 > userint3 = 0 > userint4 = 0 > userreal1 = 0 > userreal2 = 0 > userreal3 = 0 > userreal4 = 0 > grpopts: > nrdf: 42998.7 429867 > ref-t: 310 310 > tau-t: 0.1 0.1 > anneal: No No > ann-npoints: 0 0 > acc: 0 0 0 > nfreeze: N N N > energygrp-flags[ 0]: 0 > efield-x: > n = 0 > efield-xt: > n = 0 > efield-y: > n = 0 > efield-yt: > n = 0 > efield-z: > n = 0 > efield-zt: > n = 0 > eSwapCoords = no > bQMMM = FALSE > QMconstraints = 0 > QMMMscheme = 0 > scalefactor = 1 > qm-opts: > ngQM = 0 > > Initializing Domain Decomposition on 4 nodes > Dynamic load balancing: auto > Will sort the charge groups at every domain (re)decomposition > Initial maximum inter charge-group distances: > two-body bonded interactions: 0.429 nm, LJ-14, atoms 13175 13183 > multi-body bonded interactions: 0.489 nm, CMAP Dih., atoms 18312 18321 > Minimum cell size due to bonded interactions: 0.537 nm > Maximum distance for 5 constraints, at 120 deg. 
angles, all-trans: 0.819 nm > Estimated maximum distance required for P-LINCS: 0.819 nm > This distance will limit the DD cell size, you can override this with -rcon > Using 0 separate PME nodes, as there are too few total > nodes for efficient splitting > Scaling the initial minimum size with 1/0.8 (option -dds) = 1.25 > Optimizing the DD grid for 4 cells with a minimum initial size of 1.024 nm > The maximum allowed number of cells is: X 11 Y 11 Z 10 > Domain decomposition grid 4 x 1 x 1, separate PME nodes 0 > PME domain decomposition: 4 x 1 x 1 > Domain decomposition nodeid 0, coordinates 0 0 0 > > Using two step summing over 2 groups of on average 2.0 processes > > Using 4 MPI processes > Using 8 OpenMP threads per MPI process
Try running with 4 ranks and 4 threads, which avoids using Hyperthreading and will probably improve performance.
> Detecting CPU SIMD instructions. > Present hardware specification: > Vendor: GenuineIntel > Brand: Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz > Family: 6 Model: 45 Stepping: 7 > Features: aes apic avx clfsh cmov cx8 cx16 htt lahf_lm mmx msr nonstop_tsc > pcid pclmuldq pdcm pdpe1gb popcnt pse rdtscp sse2 sse3 sse4.1 sse4.2 ssse3 > tdt x2apic > SIMD instructions most likely to fit this hardware: AVX_256 > SIMD instructions selected at GROMACS compile time: AVX_256 > > > 1 GPU detected on host c442-403.stampede.tacc.utexas.edu: > #0: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible > > 1 GPU user-selected for this run. > Mapping of GPUs to the 2 PP ranks in this node: #0, #0 > > NOTE: You assigned a GPU to multiple MPI processes. > Will do PME sum in reciprocal space for electrostatic interactions. > > ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++ > U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen > A smooth particle mesh Ewald method > J. Chem. Phys. 103 (1995) pp. 8577-8592 > -------- -------- --- Thank You --- -------- -------- > > Will do ordinary reciprocal space Ewald sum. 
> Using a Gaussian width (1/beta) of 0.320163 nm for Ewald > Cut-off's: NS: 1.093 Coulomb: 1 LJ: 1 > Long Range LJ corr.: <C6> 3.1875e-04 > System total charge: 1.000 > Generated table with 1046 data points for Ewald. > Tabscale = 500 points/nm > Generated table with 1046 data points for LJ6. > Tabscale = 500 points/nm > Generated table with 1046 data points for LJ12. > Tabscale = 500 points/nm > Generated table with 1046 data points for 1-4 COUL. > Tabscale = 500 points/nm > Generated table with 1046 data points for 1-4 LJ6. > Tabscale = 500 points/nm > Generated table with 1046 data points for 1-4 LJ12. > Tabscale = 500 points/nm > > Using CUDA 8x8 non-bonded kernels > > Potential shift: LJ r^-12: -1.000e+00 r^-6: -1.000e+00, Ewald -1.000e-05 > Initialized non-bonded Ewald correction tables, spacing: 6.52e-04 size: 1536 > > > Overriding thread affinity set outside mdrun_mpi_gpu > > Pinning threads with an auto-selected logical core stride of 1 > > Initializing Parallel LINear Constraint Solver > > ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++ > B. Hess > P-LINCS: A Parallel Linear Constraint Solver for molecular simulation > J. Chem. Theory Comput. 4 (2008) pp. 116-122 > -------- -------- --- Thank You --- -------- -------- > > The number of constraints is 21852 > There are inter charge-group constraints, > will communicate selected coordinates each lincs iteration > > ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++ > S. Miyamoto and P. A. Kollman > SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithms for Rigid > Water Models > J. Comp. Chem. 13 (1992) pp. 
952-962 > -------- -------- --- Thank You --- -------- -------- > > > Linking all bonded interactions to atoms > There are 333337 inter charge-group exclusions, > will use an extra communication step for exclusion forces for PME > > The initial number of communication pulses is: X 1 > The initial domain decomposition cell size is: X 3.06 nm > > The maximum allowed distance for charge groups involved in interactions is: > non-bonded interactions 1.093 nm > (the following are initial values, they could change due to box deformation) > two-body bonded interactions (-rdd) 1.093 nm > multi-body bonded interactions (-rdd) 1.093 nm > atoms separated by up to 5 constraints (-rcon) 3.061 nm > > When dynamic load balancing gets turned on, these settings will change to: > The maximum number of communication pulses is: X 1 > The minimum size for domain decomposition cells is 1.093 nm > The requested allowed shrink of DD cells (option -dds) is: 0.80 > The allowed shrink of domain decomposition cells is: X 0.36 > The maximum allowed distance for charge groups involved in interactions is: > non-bonded interactions 1.093 nm > two-body bonded interactions (-rdd) 1.093 nm > multi-body bonded interactions (-rdd) 1.093 nm > atoms separated by up to 5 constraints (-rcon) 1.093 nm > > > Making 1D domain decomposition grid 4 x 1 x 1, home cell index 0 0 0 > > Center of mass motion removal mode is Linear > We have the following groups for center of mass motion removal: > 0: rest > > ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++ > G. Bussi, D. Donadio and M. Parrinello > Canonical sampling through velocity rescaling > J. Chem. Phys. 126 (2007) pp. 014101 > -------- -------- --- Thank You --- -------- -------- > > There are: 236549 Atoms > Charge group distribution at step 0: 58642 59637 59750 58520 > Initial temperature: 310.644 K > > Started mdrun on node 0 Wed Jul 16 11:26:55 2014 > Step Time Lambda > 0 0.00000 0.00000 > > Energies (kJ/mol) > U-B Proper Dih. Improper Dih. 
CMAP Dih. LJ-14 > 5.07365e+04 2.95121e+04 3.01332e+03 -7.32021e+03 1.97198e+04 > Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip. > 2.01566e+05 4.01053e+05 -3.13304e+04 -3.70280e+06 2.22526e+04 > Potential Kinetic En. Total Energy Temperature Pres. DC (bar) > -3.01359e+06 6.13614e+05 -2.39998e+06 3.12141e+02 -2.18173e+02 > Pressure (bar) Constr. rmsd > -3.47613e+01 3.40465e-05 > > DD step 39 load imb.: force 16.2% > > step 80: timed with pme grid 96 96 96, coulomb cutoff 1.000: 1156.6 M-cycles > step 160: timed with pme grid 80 80 80, coulomb cutoff 1.172: 1547.8 M-cycles > step 240: timed with pme grid 96 96 96, coulomb cutoff 1.000: 1151.5 M-cycles > step 320: timed with pme grid 84 84 84, coulomb cutoff 1.116: 1385.3 M-cycles > optimal pme grid 96 96 96, coulomb cutoff 1.000 > DD step 9999 load imb.: force 12.0% > > Step Time Lambda > 10000 20.00000 0.00000 > > Energies (kJ/mol) > U-B Proper Dih. Improper Dih. CMAP Dih. LJ-14 > 5.04795e+04 2.99195e+04 2.92288e+03 -7.24773e+03 2.00091e+04 > Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip. > 2.01757e+05 4.01852e+05 -3.13182e+04 -3.70354e+06 2.21148e+04 > Potential Kinetic En. Total Energy Temperature Pres. DC (bar) > -3.01305e+06 6.07693e+05 -2.40536e+06 3.09129e+02 -2.18004e+02 > Pressure (bar) Constr. rmsd > 4.97661e+01 3.21278e-05 > > DD step 19999 load imb.: force 12.9% > > Step Time Lambda > 20000 40.00000 0.00000 > > Energies (kJ/mol) > U-B Proper Dih. Improper Dih. CMAP Dih. LJ-14 > 5.03101e+04 2.98583e+04 2.92319e+03 -7.24572e+03 1.98643e+04 > Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip. > 2.02386e+05 4.04043e+05 -3.13265e+04 -3.70922e+06 2.22318e+04 > Potential Kinetic En. Total Energy Temperature Pres. DC (bar) > -3.01618e+06 6.10725e+05 -2.40545e+06 3.10671e+02 -2.18119e+02 > Pressure (bar) Constr. rmsd > 5.25345e+01 3.19849e-05 > > DD step 29999 load imb.: force 12.9% > > Step Time Lambda > 30000 60.00000 0.00000 > > Energies (kJ/mol) > U-B Proper Dih. Improper Dih. 
CMAP Dih. LJ-14 > 5.04086e+04 2.98208e+04 2.96232e+03 -7.36511e+03 1.97707e+04 > Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip. > 2.02219e+05 4.04588e+05 -3.13898e+04 -3.70933e+06 2.18899e+04 > Potential Kinetic En. Total Energy Temperature Pres. DC (bar) > -3.01643e+06 6.09712e+05 -2.40671e+06 3.10156e+02 -2.19002e+02 > Pressure (bar) Constr. rmsd > 1.69563e+01 3.28362e-05 > > DD step 39999 load imb.: force 13.6% > > Step Time Lambda > 40000 80.00000 0.00000 > > Energies (kJ/mol) > U-B Proper Dih. Improper Dih. CMAP Dih. LJ-14 > 5.08454e+04 2.97168e+04 2.88905e+03 -7.38898e+03 1.97906e+04 > Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip. > 2.01724e+05 4.00249e+05 -3.13130e+04 -3.70497e+06 2.19180e+04 > Potential Kinetic En. Total Energy Temperature Pres. DC (bar) > -3.01654e+06 6.09973e+05 -2.40657e+06 3.10288e+02 -2.17931e+02 > Pressure (bar) Constr. rmsd > -4.75665e+01 3.26365e-05 > > DD step 49999 load imb.: force 15.1% > > Step Time Lambda > 50000 100.00000 0.00000 > > Energies (kJ/mol) > U-B Proper Dih. Improper Dih. CMAP Dih. LJ-14 > 5.05524e+04 2.96917e+04 2.93777e+03 -7.29600e+03 1.98992e+04 > Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip. > 2.01154e+05 4.02695e+05 -3.12880e+04 -3.70590e+06 2.18002e+04 > Potential Kinetic En. Total Energy Temperature Pres. DC (bar) > -3.01576e+06 6.07954e+05 -2.40780e+06 3.09262e+02 -2.17584e+02 > Pressure (bar) Constr. rmsd > -7.76501e+00 3.21618e-05 > > DD step 59999 load imb.: force 12.5% > > Step Time Lambda > 60000 120.00000 0.00000 > > Energies (kJ/mol) > U-B Proper Dih. Improper Dih. CMAP Dih. LJ-14 > 5.00781e+04 3.00110e+04 3.14274e+03 -7.37195e+03 2.00457e+04 > Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip. > 2.02606e+05 4.01699e+05 -3.13388e+04 -3.70509e+06 2.19611e+04 > Potential Kinetic En. Total Energy Temperature Pres. DC (bar) > -3.01426e+06 6.09684e+05 -2.40458e+06 3.10142e+02 -2.18291e+02 > Pressure (bar) Constr. 
rmsd > -3.63586e-01 3.19734e-05 > > DD step 69999 load imb.: force 11.2% > > Step Time Lambda > 70000 140.00000 0.00000 > > Energies (kJ/mol) > U-B Proper Dih. Improper Dih. CMAP Dih. LJ-14 > 5.08759e+04 2.99581e+04 2.98187e+03 -7.47015e+03 1.97981e+04 > Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip. > 2.02357e+05 4.01634e+05 -3.13895e+04 -3.70518e+06 2.19320e+04 > Potential Kinetic En. Total Energy Temperature Pres. DC (bar) > -3.01450e+06 6.10453e+05 -2.40404e+06 3.10533e+02 -2.18998e+02 > Pressure (bar) Constr. rmsd > -5.49562e+01 3.32050e-05 > > DD step 79999 load imb.: force 12.7% > > Step Time Lambda > 80000 160.00000 0.00000 > > Energies (kJ/mol) > U-B Proper Dih. Improper Dih. CMAP Dih. LJ-14 > 5.02476e+04 2.99203e+04 3.01794e+03 -7.41418e+03 1.99012e+04 > Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip. > 2.02636e+05 3.99866e+05 -3.13105e+04 -3.70666e+06 2.17894e+04 > Potential Kinetic En. Total Energy Temperature Pres. DC (bar) > -3.01801e+06 6.10188e+05 -2.40782e+06 3.10398e+02 -2.17897e+02 > Pressure (bar) Constr. rmsd > -7.82256e+01 3.22503e-05 > > Writing checkpoint, step 84280 at Wed Jul 16 11:41:55 2014 > > > DD step 89999 load imb.: force 9.5% > > Step Time Lambda > 90000 180.00000 0.00000 > > Energies (kJ/mol) > U-B Proper Dih. Improper Dih. CMAP Dih. LJ-14 > 5.04253e+04 3.01029e+04 3.04981e+03 -7.29947e+03 1.98989e+04 > Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip. > 2.02004e+05 4.03726e+05 -3.12806e+04 -3.70717e+06 2.20550e+04 > Potential Kinetic En. Total Energy Temperature Pres. DC (bar) > -3.01449e+06 6.08805e+05 -2.40568e+06 3.09694e+02 -2.17480e+02 > Pressure (bar) Constr. rmsd > 2.29629e+01 3.23359e-05 > > DD step 99999 load imb.: force 11.4% > > Step Time Lambda > 100000 200.00000 0.00000 > > Energies (kJ/mol) > U-B Proper Dih. Improper Dih. CMAP Dih. LJ-14 > 5.05809e+04 2.97365e+04 2.90575e+03 -7.46760e+03 2.00142e+04 > Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip. 
> 2.02442e+05 4.02628e+05 -3.13276e+04 -3.70909e+06 2.19456e+04 > Potential Kinetic En. Total Energy Temperature Pres. DC (bar) > -3.01763e+06 6.09703e+05 -2.40793e+06 3.10151e+02 -2.18135e+02 > Pressure (bar) Constr. rmsd > 2.61670e+01 3.23152e-05 > > DD step 109999 load imb.: force 11.5% > > Step Time Lambda > 110000 220.00000 0.00000 > > Energies (kJ/mol) > U-B Proper Dih. Improper Dih. CMAP Dih. LJ-14 > 5.04489e+04 2.98261e+04 2.96408e+03 -7.46597e+03 1.99103e+04 > Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip. > 2.01929e+05 4.04057e+05 -3.13158e+04 -3.70812e+06 2.21537e+04 > Potential Kinetic En. Total Energy Temperature Pres. DC (bar) > -3.01561e+06 6.09714e+05 -2.40590e+06 3.10157e+02 -2.17970e+02 > Pressure (bar) Constr. rmsd > 3.75535e+01 3.23884e-05 > > DD step 119999 load imb.: force 13.4% > > Step Time Lambda > 120000 240.00000 0.00000 > > Energies (kJ/mol) > U-B Proper Dih. Improper Dih. CMAP Dih. LJ-14 > 5.02048e+04 2.96834e+04 2.99140e+03 -7.47253e+03 1.98509e+04 > Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip. > 2.02924e+05 4.00695e+05 -3.13677e+04 -3.70737e+06 2.19556e+04 > Potential Kinetic En. Total Energy Temperature Pres. DC (bar) > -3.01790e+06 6.09085e+05 -2.40882e+06 3.09837e+02 -2.18693e+02 > Pressure (bar) Constr. rmsd > -4.17847e+01 3.24539e-05 > > DD step 129999 load imb.: force 13.9% > > Step Time Lambda > 130000 260.00000 0.00000 > > Energies (kJ/mol) > U-B Proper Dih. Improper Dih. CMAP Dih. LJ-14 > 4.99271e+04 2.98272e+04 2.93917e+03 -7.27635e+03 1.98999e+04 > Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip. > 2.02518e+05 3.97726e+05 -3.13026e+04 -3.70217e+06 2.20807e+04 > Potential Kinetic En. Total Energy Temperature Pres. DC (bar) > -3.01583e+06 6.09694e+05 -2.40614e+06 3.10147e+02 -2.17788e+02 > Pressure (bar) Constr. rmsd > -1.15382e+02 3.19481e-05 > > DD step 139999 load imb.: force 10.1% > > Step Time Lambda > 140000 280.00000 0.00000 > > Energies (kJ/mol) > U-B Proper Dih. Improper Dih. 
CMAP Dih. LJ-14 > 5.03356e+04 2.97673e+04 2.87730e+03 -7.47602e+03 1.97827e+04 > Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip. > 2.02578e+05 4.02705e+05 -3.12475e+04 -3.70659e+06 2.19426e+04 > Potential Kinetic En. Total Energy Temperature Pres. DC (bar) > -3.01532e+06 6.06119e+05 -2.40920e+06 3.08328e+02 -2.17021e+02 > Pressure (bar) Constr. rmsd > -4.87389e+01 3.29943e-05 > > DD step 149999 load imb.: force 12.0% > > Step Time Lambda > 150000 300.00000 0.00000 > > Energies (kJ/mol) > U-B Proper Dih. Improper Dih. CMAP Dih. LJ-14 > 5.01586e+04 2.98972e+04 2.97697e+03 -7.38809e+03 1.99546e+04 > Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip. > 2.02136e+05 4.04987e+05 -3.13378e+04 -3.71164e+06 2.18617e+04 > Potential Kinetic En. Total Energy Temperature Pres. DC (bar) > -3.01840e+06 6.08594e+05 -2.40980e+06 3.09587e+02 -2.18277e+02 > Pressure (bar) Constr. rmsd > 4.10102e+01 3.20743e-05 > > DD step 159999 load imb.: force 11.6% > > Step Time Lambda > 160000 320.00000 0.00000 > > Energies (kJ/mol) > U-B Proper Dih. Improper Dih. CMAP Dih. LJ-14 > 5.02983e+04 2.99181e+04 2.94672e+03 -7.49915e+03 1.99437e+04 > Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip. > 2.03281e+05 4.00190e+05 -3.12908e+04 -3.70493e+06 2.23782e+04 > Potential Kinetic En. Total Energy Temperature Pres. DC (bar) > -3.01477e+06 6.09016e+05 -2.40575e+06 3.09802e+02 -2.17623e+02 > Pressure (bar) Constr. rmsd > -6.39499e+01 3.24603e-05 > > Writing checkpoint, step 168560 at Wed Jul 16 11:56:56 2014 > > > DD step 169999 load imb.: force 12.3% > > Step Time Lambda > 170000 340.00000 0.00000 > > Energies (kJ/mol) > U-B Proper Dih. Improper Dih. CMAP Dih. LJ-14 > 4.96860e+04 2.97866e+04 2.87484e+03 -7.42084e+03 1.98346e+04 > Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip. > 2.03200e+05 4.05459e+05 -3.13545e+04 -3.71085e+06 2.20106e+04 > Potential Kinetic En. Total Energy Temperature Pres. 
DC (bar) > -3.01677e+06 6.08386e+05 -2.40839e+06 3.09481e+02 -2.18509e+02 > Pressure (bar) Constr. rmsd > 7.35129e+01 3.22101e-05 > > DD step 179999 load imb.: force 12.5% > > Step Time Lambda > 180000 360.00000 0.00000 > > Energies (kJ/mol) > U-B Proper Dih. Improper Dih. CMAP Dih. LJ-14 > 5.01938e+04 2.98005e+04 3.12308e+03 -7.44680e+03 1.97367e+04 > Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip. > 2.02109e+05 4.02954e+05 -3.12995e+04 -3.70869e+06 2.20544e+04 > Potential Kinetic En. Total Energy Temperature Pres. DC (bar) > -3.01747e+06 6.08815e+05 -2.40865e+06 3.09700e+02 -2.17744e+02 > Pressure (bar) Constr. rmsd > -1.15302e+01 3.22109e-05 > > DD step 189999 load imb.: force 13.4% > > Step Time Lambda > 190000 380.00000 0.00000 > > Energies (kJ/mol) > U-B Proper Dih. Improper Dih. CMAP Dih. LJ-14 > 5.07811e+04 2.99541e+04 2.98628e+03 -7.39091e+03 1.97456e+04 > Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip. > 2.01761e+05 4.03922e+05 -3.13225e+04 -3.71005e+06 2.19542e+04 > Potential Kinetic En. Total Energy Temperature Pres. DC (bar) > -3.01766e+06 6.10133e+05 -2.40753e+06 3.10370e+02 -2.18064e+02 > Pressure (bar) Constr. rmsd > -2.96181e+01 3.28160e-05 > > If you want to see the full log file, please give me an email address that I > could send it to.
Pastebin?
Cheers, Sz.
> Thank you. > Yunlong > > ________________________________________ > From: gromacs.org_gmx-users-boun...@maillist.sys.kth.se > <gromacs.org_gmx-users-boun...@maillist.sys.kth.se> on behalf of Mark Abraham > <mark.j.abra...@gmail.com> > Sent: July 18, 2014 23:52 > To: Discussion list for GROMACS users > Subject: Re: [gmx-users] Can't allocate memory problem > > Hi, > > That's highly unusual, and suggests you are doing something highly unusual, > like trying to run on huge numbers of threads, or very large numbers of > bonded interactions. How are you setting up to call mdrun, and what is in > your tpr? 
> > Mark > On Jul 17, 2014 10:13 PM, "Yunlong Liu" <yliu...@jhmi.edu> wrote: > >> Hi, >> >> >> I am currently experiencing a "Can't allocate memory" problem on Gromacs >> 4.6.5 with GPU acceleration. >> >> Actually, I am running my simulations on Stampede/TACC supercomputers with >> their GPU queue. My first experience is when the simulation length longer >> than 10 ns, the system starts to throw out the "Can't allocate memory" >> problem as follows: >> >> >> Fatal error: >> Not enough memory. Failed to realloc 1403808 bytes for f_t->f, >> f_t->f=0xa912a010 >> (called from file >> /admin/build/admin/rpms/stampede/BUILD/gromacs-4.6.5/src/gmxlib/bondfree.c, >> line 3840) >> For more information and tips for troubleshooting, please check the GROMACS >> website at http://www.gromacs.org/Documentation/Errors >> ------------------------------------------------------- >> >> "These Gromacs Guys Really Rock" (P.J. Meulenhoff) >> : Cannot allocate memory >> Error on node 0, will try to stop all the nodes >> Halting parallel program mdrun_mpi_gpu on CPU 0 out of 4 >> >> ------------------------------------------------------- >> Program mdrun_mpi_gpu, VERSION 4.6.5 >> Source code file: >> /admin/build/admin/rpms/stampede/BUILD/gromacs-4.6.5/src/gmxlib/smalloc.c, >> line: 241 >> >> Fatal error: >> Not enough memory. Failed to realloc 1403808 bytes for f_t->f, >> f_t->f=0xaa516e90 >> (called from file >> /admin/build/admin/rpms/stampede/BUILD/gromacs-4.6.5/src/gmxlib/bondfree.c, >> line 3840) >> For more information and tips for troubleshooting, please check the GROMACS >> website at http://www.gromacs.org/Documentation/Errors >> ------------------------------------------------------- >> >> Recently, this error occurs even I run a short NVT equilibrium. This >> problem also exists when I use Gromacs 5.0 with GPU acceleration. I looked >> up the Gromacs errors website to check the reasons for this. But it seems >> that none of those reasons will fit in this situation. 
I use a very good >> computer, the Stampede and I run short simulations. And I know gromacs use >> nanometers as unit. I tried all the solutions that I can figure out but the >> problem becomes more severe. >> >> Is there anybody that has an idea on solving this issue? >> >> Thank you. >> >> Yunlong >> >> >> >> >> >> >> >> >> Davis Yunlong Liu >> >> BCMB - Second Year PhD Candidate >> >> School of Medicine >> >> The Johns Hopkins University >> >> E-mail: yliu...@jhmi.edu<mailto:yliu...@jhmi.edu> >> -- >> Gromacs Users mailing list >> >> * Please search the archive at >> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before >> posting! >> >> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists >> >> * For (un)subscribe requests visit >> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or >> send a mail to gmx-users-requ...@gromacs.org.
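
When comparing launch settings such as the thread counts discussed in this thread, a rough throughput figure can be read off the quoted log itself. The sketch below is only an illustration, not part of GROMACS: the helper name ns_per_day is hypothetical, and the inputs are the "Started mdrun" timestamp, the first "Writing checkpoint" line, and delta-t = 0.002 from the log. The estimate includes the initial PME tuning steps, so it will sit slightly below the steady-state rate.

```python
from datetime import datetime

# Timestamp layout used in the quoted log, e.g. "Wed Jul 16 11:26:55 2014".
# Parsing the day and month names assumes an English/C locale.
LOG_TIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def ns_per_day(steps, dt_ps, start, end):
    """Simulated nanoseconds per wall-clock day between two log timestamps.

    Hypothetical helper for illustration; steps is the step count reached
    at `end`, dt_ps is the time step in picoseconds.
    """
    wall_seconds = (
        datetime.strptime(end, LOG_TIME_FORMAT)
        - datetime.strptime(start, LOG_TIME_FORMAT)
    ).total_seconds()
    simulated_ns = steps * dt_ps / 1000.0  # ps -> ns
    return simulated_ns * 86400.0 / wall_seconds


# Values copied from the log: run start, first checkpoint, delta-t = 0.002 ps.
rate = ns_per_day(
    steps=84280,
    dt_ps=0.002,
    start="Wed Jul 16 11:26:55 2014",
    end="Wed Jul 16 11:41:55 2014",
)
print(f"{rate:.1f} ns/day")
```

Running the same calculation on a log from a different launch configuration gives a like-for-like number, provided the system and the time step are unchanged.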