So I've rebuilt MVAPICH2 2.0.1 with the flags for SLURM PMI, and
then rebuilt AMBER 14. That solved one problem: every time I ran
pmemd.cuda.MPI with srun, all of the processes would run against
the same GPU card. That no longer happens; it now uses different
cards. However, CPU affinity is still a problem: both jobs get
bound to the same CPUs. This does not seem to be an issue when
running jobs via Torque, so I'm not really sure what is happening.
Could it be that, under Torque, the cpuset is what confines each
job to particular CPUs, and that MVAPICH2 is not really setting
affinity itself?
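
For reference, a SLURM-PMI build uses configure flags roughly like
these; this is a sketch based on the MVAPICH2 user guide, and the
install prefix and PMI version are assumptions, not our exact build:

    # Rebuild MVAPICH2 so that srun (via SLURM's PMI) launches and
    # places the ranks, instead of MVAPICH2's own mpiexec/mpirun_rsh.
    # Prefix is illustrative; pmi2 assumes SLURM's PMI2 support.
    ./configure --prefix=/opt/mvapich2-2.0.1-slurm \
                --with-pm=slurm --with-pmi=pmi2
    make -j8 && make install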

I'm actually not sure which mailing list this belongs on at this
point. It does seem as if this works correctly with Torque and not
with SLURM, which would seem to implicate SLURM. But it also seems
that MVAPICH2 should be making this happen and isn't. If I had some
pointers on where to look, I could figure out what's going on.
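
In the meantime, the quickest check I know of is to print the
kernel-side affinity mask from inside the job, independent of
MVAPICH2's own reporting (a sketch; the two-task layout is just an
example):

    # Print each task's allowed CPUs as the kernel sees them.
    # Cpus_allowed_list reflects the effective cpuset/affinity mask.
    srun -n 2 bash -c \
        'echo "task $SLURM_PROCID: $(grep Cpus_allowed_list /proc/self/status)"'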

On 02/10/2015 09:24 PM, Novosielski, Ryan wrote:
> 
> So it certainly could be related to affinity. Here is the
> affinity-related output from the two PBS jobs:
> 
> run001:
> -------------CPU AFFINITY-------------
> RANK:0  CPU_SET:   0
> RANK:1  CPU_SET:   1
> -------------------------------------
> -------------CPU AFFINITY-------------
> RANK:0  CPU_SET:   0
> RANK:1  CPU_SET:   1
> -------------------------------------
> 
> run002:
> -------------CPU AFFINITY-------------
> RANK:0  CPU_SET:   4
> RANK:1  CPU_SET:   5
> -------------------------------------
> -------------CPU AFFINITY-------------
> RANK:0  CPU_SET:   4
> RANK:1  CPU_SET:   5
> -------------------------------------
> 
> Now the two SLURM jobs:
> 
> run001:
> -------------CPU AFFINITY-------------
> RANK:0  CPU_SET:   0
> RANK:1  CPU_SET:   1
> -------------------------------------
> 
> run002:
> -------------CPU AFFINITY-------------
> RANK:0  CPU_SET:   0
> RANK:1  CPU_SET:   1
> -------------------------------------
> 
> Both jobs are running on an MVAPICH2 that was built without
> setting SLURM as the PMI; the jobs spawn their processes via
> MVAPICH2's mpiexec. I tried srun, but it seemed to run all of the
> jobs on the same GPU, so I was waiting to recompile MVAPICH2 and
> then AMBER using the SLURM PMI. I guess it's possible that will
> solve the problem, but this is still peculiar.
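> 
> A quick way to confirm the per-rank GPU assignment, assuming the
> cluster's gres/gpu setup exports CUDA_VISIBLE_DEVICES (a sketch,
> not the exact job script):
> 
>     # Show which GPU(s) SLURM hands each task.
>     srun -n 2 bash -c \
>         'echo "task $SLURM_PROCID: CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"'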
> 
> ________________________________________
> From: Jonathan Perkins [perki...@cse.ohio-state.edu]
> Sent: Tuesday, February 10, 2015 4:42 PM
> To: slurm-dev
> Cc: Novosielski, Ryan
> Subject: [slurm-dev] Amber + MVAPICH2 slower with SLURM vs PBS
> 
> Do you have both environments available to do this comparison? If
> so, is SLURM vs. Torque the only difference?
> 
> I do think it would be good to provide the output of the MPI job
> with the two variables I mentioned in the earlier post. Maybe it
> will show a difference in affinity; otherwise, something else may
> be at play.
> 
> Between your two jobs with SLURM, did you only flip the
> TaskAffinity setting? It seems that affinity in MVAPICH2 was
> enabled in both runs, so I would expect the second run not to
> perform so badly.
> 
> On Sat, Feb 07, 2015 at 10:48:31AM -0800, Novosielski, Ryan wrote:
>> So I turned off TaskAffinity (=none) and we ran two CUDA/GPU jobs
>> on one node. Apparently the performance with PBS/Torque is good
>> and with Slurm it is not. I'm confused as to why it would make
>> any difference:
>> 
>> Running one MPI job with Slurm:
>>   GPU utilization: 74-99%
>>   CPU: 4 cores (0-3), 64-84% utilization
>>   Speed: 21 ns/day, vs. the previously reported 25.6 ns/day with PBS
>> 
>> After submitting the second MPI job:
>>   GPU utilization: down to 9-13% (slightly better than the 1-3% it
>>   was before, with TaskAffinity enabled)
>>   CPU: 4 cores, the same 0-3, at 99% utilization
>>   Speed: very slow; it finally came in at 1.45 ns/day
>> 
>> Would that variable still be helpful to try?
>> 
>> We're using Slurm 14.11.3, MVAPICH2 2.0, Intel Compiler 15.0.1,
>> and AMBER 14 for these performance numbers. The GPUs are M2090s, I
>> think; I'd have to check that he wasn't using the K20s.
>> 
>> On Feb 7, 2015, at 12:34, Jonathan Perkins
>> <perki...@cse.ohio-state.edu> wrote:
>> 
>> Can you set MV2_SHOW_CPU_BINDING to 1 when running your job? This
>> should show whether affinity is causing your processes to be
>> oversubscribed on a set of cores.
>> 
>> If that is the case, you can disable affinity in the library by
>> setting MV2_ENABLE_AFFINITY to 0.
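>> 
>> For example, something along these lines in the job script (the
>> application command line is illustrative):
>> 
>>     # Print the CPU set chosen for each MPI rank at startup.
>>     export MV2_SHOW_CPU_BINDING=1
>>     mpiexec -n 2 ./pmemd.cuda.MPI
>> 
>>     # If binding turns out to be the problem, take the library out
>>     # of the picture and let the scheduler place the ranks:
>>     export MV2_ENABLE_AFFINITY=0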
>> 
>> On Fri, Feb 06, 2015 at 03:18:48PM -0800, Novosielski, Ryan wrote:
>> I am running into a similar problem, with Slurm 14.11.3 and
>> MVAPICH2 2.0. I am wondering whether having CPU affinity
>> configured in both MVAPICH2 and Slurm at the same time is a bad
>> idea (I've also since realized that Slurm's affinity support uses
>> cgroups, and that the 2.6.18 kernel in RHEL5 does not support them
>> anyway -- but it didn't seem to be harming anything. Maybe it was?).
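>> 
>> For context, the Slurm-side binding knob lives in slurm.conf; a
>> minimal sketch of the two settings in question (illustrative, not
>> our full config):
>> 
>>     # slurm.conf -- have Slurm's task plugin bind tasks to cores:
>>     TaskPlugin=task/affinity
>> 
>>     # or disable Slurm-side binding entirely:
>>     TaskPlugin=task/none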
>> 
> 
> --
> Jonathan Perkins
> 

--
____ *Note: UMDNJ is now Rutgers-Biomedical and Health Sciences*
|| \\UTGERS      |---------------------*O*---------------------
||_// Biomedical | Ryan Novosielski - Senior Technologist
|| \\ and Health | novos...@rutgers.edu - 973/972.0922 (2x0922)
||  \\  Sciences | OIRT/High Perf & Res Comp - MSB C630, Newark
     `'