Hi, thanks Szilard for the reply. Below I include what I get on stdout/stderr and the mdp file (the log file is too long). The timings are obviously not exact because of buffering, but to me it looks as if GROMACS reproducibly spends about 2 minutes running on ONE thread and only then switches to using 4. The switch happens at the point where it starts the test runs to establish the optimal PME grid. Those test runs take only a couple of seconds (about 3000 steps = 6 ps of a job that does around 220 ns/day on this machine), and from then onwards everything is normal and as expected...
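Just to make that estimate explicit, here is the back-of-the-envelope arithmetic as a quick Python check (the dt = 0.002 ps comes from the mdp below and the ~220 ns/day is only my rough production rate, so the numbers are approximate):

steps = 3000                   # roughly the number of PME tuning steps seen below
dt_ps = 0.002                  # time step from the mdp file
rate_ns_day = 220.0            # rough production throughput on this machine
sim_ps = steps * dt_ps         # 6 ps of simulated time
wall_s = sim_ps / 1000.0 / rate_ns_day * 86400.0
print(f"{sim_ps:.0f} ps -> roughly {wall_s:.1f} s of wall time")
# prints: 6 ps -> roughly 2.4 s of wall time

So a couple of seconds is indeed about all the tuning itself should take.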
In the log file, after the job starts it quickly gets to the point where it says "Will do PME sum in reciprocal space for electrostatic interactions.", followed by a literature reference. It then stops there for about 2 minutes and only afterwards carries on with "Using a Gaussian width (1/beta) of 0.384195 nm for Ewald". During these 2 minutes the job runs on only one thread; later it switches to 4.

This is on a workstation with an Intel i7-3820 @ 3.60 GHz and one Nvidia GTX 1060 6 GB; my gmx is 2018.4 on a recently updated Debian testing/buster.

=======================
stderr/stdout: job starts at 01:26:14
=======================

                  :-) GROMACS - gmx mdrun, 2018.4 (-:

GROMACS is written by:
  Emile Apol, Rossen Apostolov, Paul Bauer, Herman J.C. Berendsen,
  Par Bjelkmar, Aldert van Buuren, Rudi van Drunen, Anton Feenstra,
  Gerrit Groenhof, Aleksei Iupinov, Christoph Junghans, Anca Hamuraru,
  Vincent Hindriksen, Dimitrios Karkoulis, Peter Kasson, Jiri Kraus,
  Carsten Kutzner, Per Larsson, Justin A. Lemkul, Viveca Lindahl,
  Magnus Lundborg, Pieter Meulenhoff, Erik Marklund, Teemu Murtola,
  Szilard Pall, Sander Pronk, Roland Schulz, Alexey Shvetsov,
  Michael Shirts, Alfons Sijbers, Peter Tieleman, Teemu Virolainen,
  Christian Wennberg, Maarten Wolf
and the project leaders:
  Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel

Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2017, The GROMACS development team at
Uppsala University, Stockholm University and
the Royal Institute of Technology, Sweden.
check out http://www.gromacs.org for more information.

GROMACS is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License
as published by the Free Software Foundation; either version 2.1
of the License, or (at your option) any later version.

GROMACS:      gmx mdrun, version 2018.4
Executable:   /home/michael/local/gromacs-2018.4-bin-cuda10/bin/gmx
Data prefix:  /home/michael/local/gromacs-2018.4-bin-cuda10
Working dir:  /home/michael/test/test-start-job-delay
Command line:
  gmx mdrun -v -deffnm testd

Reading file testd.tpr, VERSION 2018.4 (single precision)

stops here until about 01:26:32
========================================

Changing nstlist from 50 to 100, rlist from 1.2 to 1.236

Using 1 MPI thread
Using 4 OpenMP threads

1 GPU auto-selected for this run.
Mapping of GPU IDs to the 2 GPU tasks in the 1 rank on this node:
  PP:0,PME:0

stops here until 01:28:46
==============================================

starting mdrun 'system'
1000000 steps,   2000.0 ps.
step  200: timed with pme grid 44 48 48, coulomb cutoff 1.200: 332.1 M-cycles
step  400: timed with pme grid 42 42 44, coulomb cutoff 1.303: 362.5 M-cycles

stops here until 01:28:47
===========================================

step  600: timed with pme grid 36 40 40, coulomb cutoff 1.456: 390.1 M-cycles
step  800: timed with pme grid 40 40 40, coulomb cutoff 1.394: 365.9 M-cycles
step 1000: timed with pme grid 40 40 42, coulomb cutoff 1.368: 355.7 M-cycles
step 1200: timed with pme grid 40 42 42, coulomb cutoff 1.328: 343.1 M-cycles
step 1400: timed with pme grid 40 42 44, coulomb cutoff 1.310: 340.4 M-cycles
step 1600: timed with pme grid 42 42 44, coulomb cutoff 1.303: 337.4 M-cycles
step 1800: timed with pme grid 42 44 44, coulomb cutoff 1.268: 325.7 M-cycles
step 2000: timed with pme grid 42 44 48, coulomb cutoff 1.248: 316.7 M-cycles
step 2200: timed with pme grid 44 48 48, coulomb cutoff 1.200: 295.9 M-cycles
step 2400: timed with pme grid 42 44 44, coulomb cutoff 1.268: 324.1 M-cycles
step 2600: timed with pme grid 42 44 48, coulomb cutoff 1.248: 317.3 M-cycles
step 2800: timed with pme grid 44 48 48, coulomb cutoff 1.200: 294.8 M-cycles
           optimal pme grid 44 48 48, coulomb cutoff 1.200

stops here until 01:28:50
==========================================

mdp file:

integrator       = md
dt               = 0.002
nsteps           = 1000000
comm-grps        = System
nstcomm          = 50
;
nstxout          = 25000
nstvout          = 0
nstfout          = 0
nstlog           = 25000
nstenergy        = 25000
;
nstlist          = 50
ns_type          = grid
pbc              = xyz
rlist            = 1.2
cutoff-scheme    = Verlet
;
coulombtype      = PME
rcoulomb         = 1.2
vdw_type         = cut-off
rvdw             = 1.2
;
constraints      = h-bonds
;
tcoupl           = v-rescale
tau-t            = 0.1
ref-t            = 313.0
tc-grps          = System
;
pcoupl           = berendsen
pcoupltype       = anisotropic
tau-p            = 2.0
compressibility  = 4.5e-5 4.5e-5 4.5e-5 0 0 0
ref-p            = 1 1 1 0 0 0
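Since, as I said above, the timestamps are blurred by output buffering, here is a minimal sketch (in Python, just for illustration; it is not how the output above was produced) of how the run could be re-captured with a wall-clock timestamp on every line, to pin down exactly where the ~2 minute stall sits:

import subprocess, sys, time

# launch mdrun and prefix each line of its merged stdout/stderr with the
# wall-clock time at which the line arrived on our side of the pipe
proc = subprocess.Popen(
    ["gmx", "mdrun", "-v", "-deffnm", "testd"],
    stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
    text=True, bufsize=1,
)
for line in proc.stdout:
    sys.stdout.write(time.strftime("%H:%M:%S ") + line)
    sys.stdout.flush()
proc.wait()

(mdrun writes most of its progress to stderr, which is merged into stdout here; anything mdrun itself buffers may of course still show up late.)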