A silly question:  Is it certain that the /etc/sysconfig/slurm file is sourced 
under a systemd setup in CentOS 7?  I've been playing with setting various 
things to crazy values to confirm that things work the way I expect, and none 
of the ulimit or environment variable settings I place in /etc/sysconfig/slurm 
appear to take effect under srun/salloc/sbatch.

I notice that the slurm systemd service files point to /etc/default/slurm* for 
their "EnvironmentFile", and that the slurmd file placed by the install 
includes a line for the SLURMD_OPTIONS variable, which many people online seem 
to set in the /etc/sysconfig/slurm file instead.

Are there different sourced files in the CentOS 7 / systemd setup for slurm?
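For reference, here's how I've been checking which file the unit actually 
reads, along with the drop-in I'd expect to need if /etc/sysconfig/slurm 
really isn't sourced.  (The drop-in file name below is just my guess, not 
anything shipped by the packages.)

    # show the EnvironmentFile lines in the installed unit
    systemctl cat slurmd.service | grep EnvironmentFile

    # hypothetical drop-in: /etc/systemd/system/slurmd.service.d/override.conf
    [Service]
    EnvironmentFile=-/etc/sysconfig/slurm

    # then reload unit files and restart
    systemctl daemon-reload
    systemctl restart slurmd

(The leading "-" on the EnvironmentFile line just tells systemd not to fail if 
the file is missing.)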

Sorry if this is a silly question.

Paul.

> On Jun 17, 2015, at 22:04, Wiegand, Paul <[email protected]> wrote:
> 
> 
> Thanks Chris, these are useful.  Our setup is like yours:  we maintain 
> separate gcc and Intel Composer build chains for compatibility.  Right now, 
> I'm solely focused on ic; gcc will follow at a later date ... then I get to 
> start again with MVAPICH2 (yay!).  The most common applications of our users 
> rely on ic builds and OpenMPI, so I started there.
> 
> My build parameters are a lot like yours.  The only one I wonder about is 
> --without-scif.  We've a mixture of Phi and non-Phi nodes, so I wasn't sure 
> how to set this one and right now I take the default (which I gather includes 
> SCIF).  Do you have any insight as to your choice on this?
> 
> For giggles I tried again just now, focusing on various nodes (both with Phi 
> and without), and the results are all the same (segfault).
> 
> We also pass --disable-vt and --disable-pty-support on our OpenMPI build, but I 
> don't think these would cause the problem I'm seeing.  Any disagreement with 
> that?
> 
> As to Uwe's suggestion about the PMI plugin, I've built a number of different 
> ways, including with the PMI plugin.  The libs are built and present, and I 
> can run as root without setting the resv-ports.  When I build without PMI, I 
> set the resv-ports, and it still doesn't work.  So I don't think PMI is the 
> issue, but I appreciate the suggestion.
> 
> The fact that I can run as root and that I can run without openib (that is, 
> using --mca btl ^openib on the mpirun call) suggests to me that there's some 
> kind of permissions / resource access problem to the IB.  But I can't 
> understand why this would work fine outside of slurm but be a problem under 
> slurm.
> 
> Someone at SSERCA suggested setting PropagateResourceLimits=NONE in the 
> slurm.conf file and opening up more than just memlock limits in the 
> /etc/sysconfig/slurm file.  I did all that, but none of that solved anything.
> 
> I'm stumped.
> 
> Paul.
> 
> 
> 
> 
>> On Jun 17, 2015, at 20:03, Christopher Samuel <[email protected]> wrote:
>> 
>> 
>> On 18/06/15 00:38, Wiegand, Paul wrote:
>> 
>>> We have just started experimenting with Slurm, and I'm having trouble
>>> running OpenMPI jobs over Slurm.
>> 
>> In case it helps Slurm here is configured with:
>> 
>> ./configure --prefix=/usr/local/slurm/${slurm_ver} 
>> --sysconfdir=/usr/local/slurm/etc
>> 
>> Open-MPI (1.6.x) is configured with:
>> 
>> ./configure --prefix=/usr/local/${BASE} --with-slurm --with-openib 
>> --enable-static  --enable-shared
>> 
>> Our test build of 1.8.4 (using a different build strategy
>> to separate out GCC and Intel builds to avoid the annoying
>> incompatibility of Fortran MOD files for our one user who
>> ran into it) is configured with:
>> 
>> configure --prefix=/usr/local/openmpi-${COMPILER}/${VERSION} --with-slurm 
>> --with-verbs --enable-static  --enable-shared --without-scif 
>> --with-pmi=/usr/local/slurm/latest
>> 
>> Note that /usr/local/slurm/latest is a symlink to whatever
>> /usr/local/slurm/${slurm_ver} is the current version we're
>> running (currently 14.03.11).
>> 
>> You will need to fix up your resource limit settings for
>> maximum lockable memory too, but that shouldn't cause the
>> issue you're seeing.
>> 
>> Best of luck!
>> Chris
>> -- 
>> Christopher Samuel        Senior Systems Administrator
>> VLSCI - Victorian Life Sciences Computation Initiative
>> Email: [email protected] Phone: +61 (0)3 903 55545
>> http://www.vlsci.org.au/      http://twitter.com/vlsci
