A silly question: is it certain that the /etc/sysconfig/slurm file is sourced under a systemd setup on CentOS 7? I've been setting various things to crazy values to confirm that they work the way I expect, and none of the ulimit or environment variable settings that I place in /etc/sysconfig/slurm appear to be visible in an srun/salloc/sbatch.
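One way to check this directly is to read the EnvironmentFile= entries out of the installed unit file. The sketch below is only an illustration: the function name unit_env_files is made up here, and the unit path in the usage comment is an assumption that depends on how Slurm was packaged. Note also that systemd reads an EnvironmentFile as plain VAR=value assignments rather than running it as a shell script, so a ulimit command placed in such a file would not be executed; under systemd, limits are normally set with directives such as LimitMEMLOCK= in the unit itself.

```shell
#!/bin/sh
# Print the paths named by EnvironmentFile= lines in a systemd unit file.
# A leading "-" (systemd's "ignore if the file is missing" marker) is stripped.
# unit_env_files is a hypothetical helper name, not part of Slurm or systemd.
unit_env_files() {
    sed -n 's/^[[:space:]]*EnvironmentFile=-\{0,1\}//p' "$1"
}

# Possible usage on a compute node (the unit path is an assumption):
#   unit_env_files /usr/lib/systemd/system/slurmd.service
#
# To see the limits the running daemon actually has, /proc can be read:
#   grep -i 'locked memory' /proc/"$(pidof slurmd)"/limits
```

If the function prints /etc/default/slurmd rather than /etc/sysconfig/slurm, that would explain why settings placed in the latter never reach the daemon.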
I notice that the slurm systemd service files point to /etc/default/slurm* for their "EnvironmentFile", and that for the slurmd version, the file placed by the install includes a line for the SLURMD_OPTIONS var, which I see a lot of people online set in the /etc/sysconfig/slurm file. Are different files sourced in the CentOS 7 / systemd setup for slurm?

Sorry if this is a silly question.

Paul.

> On Jun 17, 2015, at 22:04, Wiegand, Paul <[email protected]> wrote:
>
> Thanks Chris, these are useful.  Our setup is like yours: we maintain
> separate gcc and Intel Composer build chains for compatibility.  Right now,
> I'm solely focused on ic, gcc to follow at a later date ... then I get to
> start again with MVAPICH2 (yay!).  The most common applications of our users
> rely on ic builds and OpenMPI, so I started there.
>
> My build parameters are a lot like yours.  The only one I wonder about is
> --without-scif.  We've a mixture of Phi and non-Phi nodes, so I wasn't sure
> how to set this one and right now I take the default (which I gather includes
> SCIF).  Do you have any insight as to your choice on this?
>
> For giggles I tried again just now, focusing on various nodes (both with Phi
> and without), and the results are all the same (segfault).
>
> We also --disable_vt and --disable-pty-support on our OpenMPI build, but I
> don't think these would cause the problem I'm seeing.  Any disagreement with
> that?
>
> As to Uwe's suggestion about the PMI plugin, I've built a number of different
> ways, including with the PMI plugin.  The libs are built and present, and I
> can run as root without setting the resv-ports.  When I build without PMI, I
> set the resv-ports, and it still doesn't work.  So I don't think PMI is the
> issue, but I appreciate the suggestion.
>
> The fact that I can run as root and that I can run without openib (that is
> using --mca btl ^openib on the mpirun call) suggests to me that there's some
> kind of permissions / resource access problem to the IB.  But I can't
> understand why this would work fine outside of slurm but be a problem under
> slurm.
>
> Someone at SSERCA suggested setting PropagateResourceLimits=NONE in the
> slurm.conf file and opening up more than just memlock limits in the
> /etc/sysconfig/slurm file.  I did all that, but none of that solved anything.
>
> I'm stumped.
>
> Paul.
>
>
>> On Jun 17, 2015, at 20:03, Christopher Samuel <[email protected]> wrote:
>>
>> On 18/06/15 00:38, Wiegand, Paul wrote:
>>
>>> We have just started experimenting with Slurm, and I'm having trouble
>>> running OpenMPI jobs over Slurm.
>>
>> In case it helps Slurm here is configured with:
>>
>> ./configure --prefix=/usr/local/slurm/${slurm_ver}
>> --sysconfdir=/usr/local/slurm/etc
>>
>> Open-MPI (1.6.x) is configured with:
>>
>> ./configure --prefix=/usr/local/${BASE} --with-slurm --with-openib
>> --enable-static --enable-shared
>>
>> Our test build of 1.8.4 (using a different build strategy
>> to separate out GCC and Intel builds to avoid the annoying
>> incompatibility of Fortran MOD files for our one user who
>> ran into it) is configured with:
>>
>> configure --prefix=/usr/local/openmpi-${COMPILER}/${VERSION} --with-slurm
>> --with-verbs --enable-static --enable-shared --without-scif
>> --with-pmi=/usr/local/slurm/latest
>>
>> Note that /usr/local/slurm/latest is a symlink to whatever
>> /usr/local/slurm/${slurm_ver} is the current version we're
>> running (currently 14.03.11).
>>
>> You will need to fix up your resource limit settings for
>> maximum lockable memory too, but that shouldn't cause the
>> issue you're seeing.
>>
>> Best of luck!
>> Chris
>> --
>> Christopher Samuel        Senior Systems Administrator
>> VLSCI - Victorian Life Sciences Computation Initiative
>> Email: [email protected]      Phone: +61 (0)3 903 55545
>> http://www.vlsci.org.au/      http://twitter.com/vlsci
