Hello Szilárd Páll,

Thank you for your reply. I tried your command:
gmx mdrun -ntmpi 7 -npme 1 -nb gpu -pme gpu -bonded gpu -gpu_id 0,2,4,6 -gputasks 001122334

but got the following error:

Using 7 MPI threads
Using 10 OpenMP threads per tMPI thread

Program:     gmx mdrun, version 2019.3
Source file: src/gromacs/taskassignment/taskassignment.cpp (line 255)
Function:    std::vector<std::vector<gmx::GpuTaskMapping> >::value_type gmx::runTaskAssignment(const std::vector<int>&, const std::vector<int>&, const gmx_hw_info_t&, const gmx::MDLogger&, const t_commrec*, const gmx_multisim_t*, const gmx::PhysicalNodeCommunicator&, const std::vector<gmx::GpuTask>&, bool, PmeRunMode)
MPI rank:    0 (out of 7)

Inconsistency in user input:
There were 7 GPU tasks found on node localhost.localdomain, but 4 GPUs were
available. If the GPUs are equivalent, then it is usually best to have a
number of tasks that is a multiple of the number of GPUs. You should
reconsider your GPU task assignment, number of ranks, or your use of the
-nb, -pme, and -npme options, perhaps after measuring the performance you
can get.

Could you tell me how to correct this?

Best regards,
Yeping

------------------------------------------------------------------

Hi,

You have 2x Xeon Gold 6150, which is 2 x 18 = 36 cores; Intel CPUs support 2 threads per core (Hyper-Threading), hence the 72.
https://ark.intel.com/content/www/us/en/ark/products/120490/intel-xeon-gold-6150-processor-24-75m-cache-2-70-ghz.html

You will not be able to scale efficiently over 8 GPUs in a single simulation with the current code. While performance will likely improve in the next release, due to PCI bus and PME scaling limitations it is unlikely you will see much benefit beyond 4 GPUs even with GROMACS 2020.

Try running on 3-4 GPUs with at least 2 ranks on each, and one separate PME rank. You might also want to use every second GPU rather than the first four to avoid overloading the PCI bus; e.g.
gmx mdrun -ntmpi 7 -npme 1 -nb gpu -pme gpu -bonded gpu -gpu_id 0,2,4,6 -gputasks 001122334

Cheers,
--
Szilárd

On Thu, Sep 5, 2019 at 1:12 AM 孙业平 <sunyeping at aliyun.com> wrote:

> Hello Mark Abraham,
>
> Thank you very much for your reply. I will definitely check the webinar and
> the GROMACS documentation. But now I am confused and expect a direct
> solution. The workstation should have 18 cores each with 4 hyperthreads.
> The output of "lscpu" reads:
>
> Architecture:          x86_64
> CPU op-mode(s):        32-bit, 64-bit
> Byte Order:            Little Endian
> CPU(s):                72
> On-line CPU(s) list:   0-71
> Thread(s) per core:    2
> Core(s) per socket:    18
> Socket(s):             2
> NUMA node(s):          2
> Vendor ID:             GenuineIntel
> CPU family:            6
> Model:                 85
> Model name:            Intel(R) Xeon(R) Gold 6150 CPU @ 2.70GHz
> Stepping:              4
> CPU MHz:               2701.000
> CPU max MHz:           2701.0000
> CPU min MHz:           1200.0000
> BogoMIPS:              5400.00
> Virtualization:        VT-x
> L1d cache:             32K
> L1i cache:             32K
> L2 cache:              1024K
> L3 cache:              25344K
> NUMA node0 CPU(s):     0-17,36-53
> NUMA node1 CPU(s):     18-35,54-71
>
> Now I don't want to do multiple simulations and just want to run a single
> simulation. When assigning the simulation to only one GPU (gmx mdrun -v
> -gpu_id 0 -deffnm md), the simulation performance is 90 ns/day. However,
> when I don't assign the GPU but let all GPUs work by:
>
> gmx mdrun -v -deffnm md
>
> the simulation performance is only 2 ns/day.
>
> So what is the correct command to make full use of all GPUs and achieve the
> best performance (which I expect should be much higher than the 90 ns/day
> with only one GPU)? Could you give me further suggestions and help?
>
> Best regards,
> Yeping

------------------------------------------------------------------

From: 孙业平 <sunyep...@aliyun.com>
Sent At: 2019 Sep. 5 (Thu.)
07:12
To: gromacs <gmx-us...@gromacs.org>; Mark Abraham <mark.j.abra...@gmail.com>
Cc: gromacs.org_gmx-users <gromacs.org_gmx-users@maillist.sys.kth.se>
Subject: Re: [gmx-users] The problem of utilizing multiple GPU

Hello Mark Abraham,

Thank you very much for your reply. I will definitely check the webinar and the GROMACS documentation. But now I am confused and expect a direct solution. The workstation should have 18 cores each with 4 hyperthreads. The output of "lscpu" reads:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                72
On-line CPU(s) list:   0-71
Thread(s) per core:    2
Core(s) per socket:    18
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 85
Model name:            Intel(R) Xeon(R) Gold 6150 CPU @ 2.70GHz
Stepping:              4
CPU MHz:               2701.000
CPU max MHz:           2701.0000
CPU min MHz:           1200.0000
BogoMIPS:              5400.00
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              1024K
L3 cache:              25344K
NUMA node0 CPU(s):     0-17,36-53
NUMA node1 CPU(s):     18-35,54-71

Now I don't want to do multiple simulations and just want to run a single simulation. When assigning the simulation to only one GPU (gmx mdrun -v -gpu_id 0 -deffnm md), the simulation performance is 90 ns/day. However, when I don't assign the GPU but let all GPUs work by:

gmx mdrun -v -deffnm md

the simulation performance is only 2 ns/day.

So what is the correct command to make full use of all GPUs and achieve the best performance (which I expect should be much higher than the 90 ns/day with only one GPU)? Could you give me further suggestions and help?

Best regards,
Yeping

------------------------------------------------------------------

From: Mark Abraham <mark.j.abra...@gmail.com>
Sent At: 2019 Sep. 4 (Wed.)
19:10
To: gromacs <gmx-us...@gromacs.org>; 孙业平 <sunyep...@aliyun.com>
Cc: gromacs.org_gmx-users <gromacs.org_gmx-users@maillist.sys.kth.se>
Subject: Re: [gmx-users] The problem of utilizing multiple GPU

Hi,

On Wed, 4 Sep 2019 at 12:54, sunyeping <sunyep...@aliyun.com> wrote:

> Dear everyone,
>
> I am trying to do simulation with a workstation with 72 cores and 8
> GeForce 1080 GPUs.

72 cores, or just 36 cores each with two hyperthreads? (It matters because you might not want to share cores between simulations, which is what you'd get if you just assigned 9 hyperthreads per GPU and 1 GPU per simulation.)

> When I do not assign a certain GPU with the command:
>
> gmx mdrun -v -deffnm md
>
> all GPUs are used, but the utilization of each GPU is extremely low (only
> 1-2%), and the simulation will be finished after several months.

Yep. Too many workers for not enough work means everyone spends more time coordinating than working. This is likely to improve in GROMACS 2020 (beta out shortly).

> In contrast, when I assign the simulation task to only one GPU:
>
> gmx mdrun -v -gpu_id 0 -deffnm md
>
> the GPU utilization can reach 60-70%, and the simulation can be finished
> within a week.

Utilization is only a proxy - what you actually want to measure is the rate of simulation, i.e. ns/day.

> Even when I use only two GPUs:
>
> gmx mdrun -v -gpu_id 0,2 -deffnm md
>
> the GPU utilizations are very low and the simulation is very slow.

That could be for a variety of reasons, which you could diagnose by looking at the performance report at the end of the log file and comparing different runs.

> I think I may misuse the GPUs for GROMACS simulation. Could you tell me
> what is the correct way to use multiple GPUs?

If you're happy running multiple simulations, then the easiest thing to do is to use the existing multi-simulation support to do

mpirun -np 8 gmx_mpi mdrun -multidir dir0 dir1 dir2 ... dir7

and let mdrun handle the details.
Otherwise you have to get involved in assigning a subset of the CPU cores and GPUs to each job so that each one runs fast and does not conflict with the others. See the GROMACS documentation for the version you're running, e.g.
http://manual.gromacs.org/documentation/current/user-guide/mdrun-performance.html#running-mdrun-within-a-single-node

You probably want to check out this webinar tomorrow:
https://bioexcel.eu/webinar-more-bang-for-your-buck-improved-use-of-gpu-nodes-for-gromacs-2018-2019-09-05/

Mark

> Best regards

--
Gromacs Users mailing list

* Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a mail to gmx-users-requ...@gromacs.org.
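[Editor's note] The "Inconsistency in user input" error earlier in the thread comes down to simple counting: with -ntmpi 7 -npme 1 there are 6 PP ranks (one nonbonded GPU task each) plus 1 PME GPU task, i.e. 7 tasks, so the -gputasks string must have exactly 7 digits, each naming a GPU made available via -gpu_id. The string 001122334 has 9 digits and references GPU 3, which is not in the set 0,2,4,6. The check can be sketched in Python as follows; this is an illustration of the counting rule only, not GROMACS code, and the function name is invented:

```python
def check_gpu_tasks(ntmpi, npme, gpu_ids, gputasks):
    """Sketch of mdrun's GPU task-assignment consistency rule.

    ntmpi    -- total thread-MPI ranks (-ntmpi)
    npme     -- separate PME ranks (-npme), one PME GPU task each
    gpu_ids  -- GPU IDs made available, e.g. [0, 2, 4, 6] for -gpu_id 0,2,4,6
    gputasks -- per-task GPU mapping string, e.g. "0022446"
    """
    n_pp = ntmpi - npme          # PP ranks: one nonbonded GPU task each
    n_tasks = n_pp + npme        # total GPU tasks that need a mapping
    mapping = [int(c) for c in gputasks]
    if len(mapping) != n_tasks:
        return False, f"{len(mapping)} mappings given, {n_tasks} GPU tasks expected"
    bad = sorted(set(g for g in mapping if g not in gpu_ids))
    if bad:
        return False, f"GPU IDs {bad} are not in the available set {gpu_ids}"
    return True, "consistent"

# The failing command from the thread: 9 mappings for 7 tasks.
print(check_gpu_tasks(7, 1, [0, 2, 4, 6], "001122334"))

# A consistent alternative: the 6 PP tasks paired two-per-GPU on 0, 2, 4,
# and the single PME task on GPU 6.
print(check_gpu_tasks(7, 1, [0, 2, 4, 6], "0022446"))
```

Under these assumptions, the corrected invocation would look like `gmx mdrun -ntmpi 7 -npme 1 -nb gpu -pme gpu -bonded gpu -gpu_id 0,2,4,6 -gputasks 0022446` — but verify the mapping against your own log file, since which digit belongs to which rank depends on the run setup.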
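[Editor's note] The core/thread question Mark raises ("72 cores, or just 36 cores each with two hyperthreads?") is answered by the lscpu output, and it also explains the "7 MPI threads x 10 OpenMP threads" line in the error log. A minimal arithmetic sketch, using only numbers taken from the thread (the variable names are illustrative, not GROMACS internals):

```python
# From the lscpu output in the thread:
sockets = 2            # Socket(s): 2
cores_per_socket = 18  # Core(s) per socket: 18
threads_per_core = 2   # Thread(s) per core: 2 (Hyper-Threading)

hw_threads = sockets * cores_per_socket * threads_per_core
print(hw_threads)      # 72, matching "CPU(s): 72" -- i.e. 36 physical cores

# With -ntmpi 7, mdrun divides the hardware threads among the ranks:
ntmpi = 7
ntomp = hw_threads // ntmpi
print(ntomp)           # 10 OpenMP threads per tMPI rank, as the log reports
print(ntmpi * ntomp)   # 70 of the 72 hardware threads in use
```

So the machine is 36 physical cores presenting 72 hardware threads, not 72 cores, which is why per-GPU resources run out faster than the raw "72" suggests.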