> On Jan 6, 2016, at 9:53 AM, Bruce Roberts <[email protected]> wrote:
>
> PMIx sounds really nice.
>
> Forgive my naive question, but for mpirun would sstat and step accounting
> continue to work as it does when using srun?

It does to an extent. You generally execute one mpirun for each job step. Each mpirun invocation launches its own daemons, and the app procs are children of those daemons. So Slurm sees the daemons and will aggregate the accounting for their children into the daemons’ usage. The daemon itself mostly just sleeps once the app is running, so it adds little of its own to the totals and the accounting should be okay (though you won’t get it for each individual app process). Perhaps others out there who have used this can chime in with their experience?

> Does mpirun also support Slurm's task placement/layout/binding/signaling?
> Our users use most of the features quite heavily as I am guessing others do
> as well.

What mpirun supports depends on the MPI implementation, so I can only address your question for OpenMPI. You’ll find that OMPI’s mpirun provides a superset of Slurm’s options (i.e., we implemented a broader level of support), but the names and syntax of those options are different, reflecting that broader support. For example, we allow more fine-grained layout/binding patterns and combinations.

>
> Thanks!
>
> On 01/06/16 07:54, Ralph Castain wrote:
>> As with all such rumors, there is some truth and some inaccuracies to it.
>> Note that the various MPIs have historically differed significantly in how
>> they implement mpirun, though the differences in terms of behavior and
>> performance have been closing. So it is hard to provide a clearcut answer
>> that spans time, and I’ll just report where we are now and looking ahead a
>> bit.
>>
>> PMI-1 support doesn't scale as well as what was done in mpirun from some of
>> the MPI libraries, and so your (A) is certainly true. Remember that Slurm
>> provides PMI-1 out-of-the-box and that you have to do a second build step to
>> add PMI-2 support. So for people that just do the std install and run, this
>> will be the expected situation.
>>
>> For those that install PMI-2 (or the new extended PMI-2 for MVAPICH), you’ll
>> see some improved performance. I suspect you’ll find that srun and mpirun
>> are pretty close to each other at that point, and the choice really just
>> comes down to your desired cmd line options.
>>
>> The test results with PMIx indicate that the performance gap between direct
>> (srun) launch and indirect (mpirun) launch is pretty much gone. You have to
>> remember that the overhead of mapping the job isn’t very large (and the time
>> is roughly equal anyway), and that both srun and mpirun distribute the
>> launch cmd in the same way (via a tree-based algorithm). Likewise, both
>> involve starting a user-level daemon and wiring those up.
>>
>> So when you break down the steps, and given that mpirun and srun are using
>> the same wireup support, you can see that the two should be equivalent.
>> Really just a question of which cmd line options you prefer.
>>
>> HTH
>> Ralph
>>
>>
>>> On Jan 6, 2016, at 6:03 AM, Novosielski, Ryan <[email protected]> wrote:
>>>
>>> Since this is an audience that might know, and this is related (but
>>> off-topic, sorry): is there any truth to the suggestions on the Internet
>>> that using srun is /slower/ than mpirun/mpiexec? There were some old
>>> mailing list messages someplace that seem to indicate A) yes, in the old
>>> days of PMI1 only or B) likely it was a misconfigured system in the first
>>> place. I haven't found anything definitive though and those threads sort of
>>> petered out without an answer.
>>>
>>> ____ *Note: UMDNJ is now Rutgers-Biomedical and Health Sciences*
>>> || \\UTGERS
>>> |---------------------*O*---------------------
>>> ||_// Biomedical | Ryan Novosielski - Senior Technologist
>>> || \\ and Health | [email protected] - 973/972.0922 (2x0922)
>>> || \\ Sciences | OIRT/High Perf & Res Comp - MSB C630, Newark
>>> `'
>>>
>>> On Jan 6, 2016, at 01:43, Ralph Castain <[email protected]> wrote:
>>>
>>>>
>>>> Simple reason, Chris - the PMI support is GPL 2.0, and so anything built
>>>> against it automatically becomes GPL. So OpenHPC cannot distribute Slurm
>>>> with those libraries.
>>>>
>>>> Instead, we are looking to use the new PMIx library to provide wireup
>>>> support, which includes backward support for PMI 1 and 2. I’m supposed to
>>>> complete that backport in my copious free time :-)
>>>>
>>>> Until then, you can only launch via mpirun - which is just as fast,
>>>> actually, but does indeed have different cmd line options.
>>>>
>>>>
>>>>> On Jan 5, 2016, at 9:22 PM, Christopher Samuel <[email protected]> wrote:
>>>>>
>>>>>
>>>>> On 06/01/16 01:46, David Carlet wrote:
>>>>>
>>>>>> Depending on where you are in the design/development phase for your
>>>>>> project, you might also consider switching to using the OpenHPC build.
>>>>>
>>>>> Caution: for reasons that are unclear OpenHPC disables Slurm PMI support:
>>>>>
>>>>> https://github.com/openhpc/ohpc/releases/download/v1.0.GA/Install_guide-CentOS7.1-1.0.pdf
>>>>>
>>>>> # At present, OpenHPC is unable to include the PMI process
>>>>> # management server normally included within Slurm which
>>>>> # implies that srun cannot be use for MPI job launch. Instead,
>>>>> # native job launch mechanisms provided by the MPI stacks are
>>>>> # utilized and prun abstracts this process for the various
>>>>> # stacks to retain a single launch command.
>>>>>
>>>>> Their spec file does:
>>>>>
>>>>> # 6/16/15 [email protected] - do not package Slurm's version of libpmi with OpenHPC.
>>>>> %if 0%{?OHPC_BUILD}
>>>>> rm -f $RPM_BUILD_ROOT/%{_libdir}/libpmi*
>>>>> rm -f $RPM_BUILD_ROOT/%{_libdir}/mpi_pmi2*
>>>>> %endif
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Christopher Samuel Senior Systems Administrator
>>>>> VLSCI - Victorian Life Sciences Computation Initiative
>>>>> Email: [email protected] Phone: +61 (0)3 903 55545
>>>>> http://www.vlsci.org.au/
>>>>> http://twitter.com/vlsci
>>
>
