On 01/06/16 11:53, Ralph Castain wrote:
On Jan 6, 2016, at 9:53 AM, Bruce Roberts <[email protected]> wrote:
PMIx sounds really nice.
Forgive my naive question, but for mpirun would sstat and step
accounting continue to work as they do when using srun?
It does to an extent. You generally execute an mpirun for each job
step. The mpirun launches its own daemons for each invocation, and the
app procs are children of these daemons. So Slurm sees the daemons and
will aggregate the accounting for their children into the daemons’
usage. However, the daemons mostly just sleep once the app is running,
and so the accounting should be okay (though you won’t get it for each
individual app process).
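To make that concrete, here is a rough sketch of where the numbers end
up (the job id and application name are illustrative, not from a real
run):

#!/bin/bash
#SBATCH -N 2 -n 32 -t 00:30:00
# mpirun starts one daemon per node as a job step; the MPI ranks are
# children of those daemons, so Slurm rolls their usage up into that
# step's accounting record.
mpirun ./my_app

# step-level usage can then be queried in the usual way, e.g.:
#   sacct -j 12345 --format=JobID,JobName,MaxRSS,Elapsed
#   sstat -j 12345.0 --format=JobID,MaxRSS,AveCPU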
Perhaps others out there who have used this can chime in with their
experience?
So it sounds like mpirun must use srun under the covers to launch the
daemons. If using the cgroups proctrack plugin, I'm guessing accounting
will work for the entire step, just not down to the individual ranks as
you state. sstat probably isn't that useful at that point. That is a
large difference.
Does mpirun also support Slurm's task
placement/layout/binding/signaling? Our users use most of those
features quite heavily, as I am guessing others do as well.
What mpirun supports depends on the MPI implementation, so I can only
address your question for OpenMPI. You’ll find that OMPI’s mpirun
provides a superset of Slurm’s options (i.e., we implemented a broader
level of support), but the names and syntax of those options are
different because they reflect that broader support. For example, we
have the ability to allow more fine-grained layout/binding patterns
and combinations.
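A couple of rough examples of what I mean (option syntax is taken from
the OMPI and Slurm man pages and varies a bit between releases, so
treat these as sketches rather than exact equivalents):

# roughly comparable layouts, expressed in each tool's own options:
srun --ntasks=16 --distribution=cyclic --cpu_bind=cores ./app
mpirun -np 16 --map-by core --bind-to core ./app

# OMPI also allows finer-grained combinations, e.g. four ranks per
# socket with each rank bound to two cores:
mpirun --map-by ppr:4:socket:PE=2 ./app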
That is interesting. With srun I am able to lay out tasks in any order
I would like, on cores or on threads within cores. What finer-grained
layout/binding patterns are you referring to?
Thanks for your insights so far, they are helpful!
Thanks!
On 01/06/16 07:54, Ralph Castain wrote:
As with all such rumors, there is some truth to it and some
inaccuracies. Note that the various MPIs have historically differed
significantly in how they implement mpirun, though the differences
in terms of behavior and performance have been closing. So it is
hard to provide a clearcut answer that spans time, and I’ll just
report where we are now and look ahead a bit.
PMI-1 support doesn't scale as well as the launch support built into
mpirun by some of the MPI libraries, and so your (A) is certainly true.
Remember that Slurm provides PMI-1 out-of-the-box and that you have
to do a second build step to add PMI-2 support. So for people who
just do the standard install and run, this will be the expected situation.
For those that install PMI-2 (or the new extended PMI-2 for
MVAPICH), you’ll see some improved performance. I suspect you’ll
find that srun and mpirun are pretty close to each other at that
point, and the choice really just comes down to your desired cmd
line options.
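For anyone who hasn't done it, that second build step looks roughly
like this (paths assume a configured Slurm source tree; details vary
by version):

# build and install the PMI-2 client library shipped in Slurm's
# contribs directory (after ./configure has been run at the top level;
# replace <version> with your Slurm version)
cd slurm-<version>/contribs/pmi2
make && make install

# then check what is available and launch with the PMI-2 plugin:
srun --mpi=list
srun --mpi=pmi2 -n 64 ./app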
The test results with PMIx indicate that the performance gap between
direct (srun) launch and indirect (mpirun) launch is pretty much
gone. You have to remember that the overhead of mapping the job
isn’t very large (and the time is roughly equal either way), and that
both srun and mpirun distribute the launch cmd in the same way (via
a tree-based algorithm). Likewise, both involve starting a
user-level daemon and wiring those up.
So when you break down the steps, and given that mpirun and srun are
using the same wireup support, you can see that the two should be
equivalent. Really just a question of which cmd line options you prefer.
HTH
Ralph
On Jan 6, 2016, at 6:03 AM, Novosielski, Ryan
<[email protected]> wrote:
Since this is an audience that might know, and this is related (but
off-topic, sorry): is there any truth to the suggestions on the
Internet that using srun is /slower/ than mpirun/mpiexec? There
were some old mailing list messages someplace that seem to indicate
A) yes, in the old days of PMI1 only or B) likely it was a
misconfigured system in the first place. I haven't found anything
definitive though and those threads sort of petered out without an
answer.
____ *Note: UMDNJ is now Rutgers-Biomedical and Health Sciences*
|| \\UTGERS      |---------------------*O*---------------------
||_// Biomedical | Ryan Novosielski - Senior Technologist
|| \\ and Health | [email protected] - 973/972.0922 (2x0922)
|| \\ Sciences   | OIRT/High Perf & Res Comp - MSB C630, Newark
`'
On Jan 6, 2016, at 01:43, Ralph Castain <[email protected]> wrote:
Simple reason, Chris - the PMI support is GPL 2.0, and so anything
built against it automatically becomes GPL. So OpenHPC cannot
distribute Slurm with those libraries.
Instead, we are looking to use the new PMIx library to provide
wireup support, which includes backward support for PMI 1 and 2.
I’m supposed to complete that backport in my copious free time :-)
Until then, you can only launch via mpirun - which is just as
fast, actually, but does indeed have different cmd line options.
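To be concrete, today that looks roughly like this, with the PMIx path
sketched for comparison (the application name is illustrative):

# launch via mpirun inside a Slurm allocation; mpirun detects Slurm
# and uses srun internally to start its own daemons:
salloc -N 4 -n 64
mpirun ./my_app

# once the PMIx support lands, direct launch should work again, e.g.:
#   srun --mpi=pmix -n 64 ./my_app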
On Jan 5, 2016, at 9:22 PM, Christopher Samuel
<[email protected]> wrote:
On 06/01/16 01:46, David Carlet wrote:
Depending on where you are in the design/development phase for your
project, you might also consider switching to using the OpenHPC
build.
Caution: for reasons that are unclear OpenHPC disables Slurm PMI
support:
https://github.com/openhpc/ohpc/releases/download/v1.0.GA/Install_guide-CentOS7.1-1.0.pdf
# At present, OpenHPC is unable to include the PMI process
# management server normally included within Slurm which
# implies that srun cannot be used for MPI job launch. Instead,
# native job launch mechanisms provided by the MPI stacks are
# utilized and prun abstracts this process for the various
# stacks to retain a single launch command.
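In practice a launch then looks something like this (a sketch
following the guide's pattern; the binary name is illustrative):

salloc -n 8 -N 2
prun ./a.out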
Their spec file does:
# 6/16/15 [email protected] - do not package Slurm's version
# of libpmi with OpenHPC.
%if 0%{?OHPC_BUILD}
rm -f $RPM_BUILD_ROOT/%{_libdir}/libpmi*
rm -f $RPM_BUILD_ROOT/%{_libdir}/mpi_pmi2*
%endif
--
Christopher Samuel Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: [email protected]
Phone: +61 (0)3 903 55545
http://www.vlsci.org.au/ http://twitter.com/vlsci