Hi Roger,
Thanks for your answer, but it doesn't work in our case and I don't
understand why.
Angelines Alberto Morillas
Unidad de Arquitectura Informática
Despacho: 22.1.32
Telf.: +34 91 346 6119
Fax: +34 91 346 6537
skype: angelines.alberto
So it seems nss_slurm does not play well with sudo.
If I connect to a box that uses it and try to use sudo, I get:
*sudo: PAM account management error: Authentication service cannot retrieve
authentication info*
Has anyone else seen this?
Is there a workaround?
Brian Andrus
Jeff,
Create a qos with maxjobs defined.
https://slurm.schedmd.com/qos.html
https://wiki.fysik.dtu.dk/niflheim/Slurm_accounting#quality-of-service-qos
If you haven't used slurm qos before, you may want to check out the other
possible limits; they are more flexible than maxjobs.
Add the qos to
Have you looked at the limits you can set at the QOS or Account level in
slurmdbd? There seems to be better granularity at those levels from what I've
seen.
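For reference, a rough sketch of what a QOS-based cap on running jobs in one partition might look like. The QOS name `part_cap`, the partition name `limited`, the node list, and the limit of 10 are placeholders, not values from this thread. It also assumes accounting through slurmdbd is already working and that limits are enforced via AccountingStorageEnforce.

```shell
# Create a QOS (hypothetical name) and cap the total number of jobs
# that may run at once under it.
sacctmgr -i add qos part_cap
sacctmgr -i modify qos part_cap set GrpJobs=10

# Check the result.
sacctmgr show qos part_cap format=Name,GrpJobs

# In slurm.conf (shown as comments because these are config lines,
# with placeholder partition and node names):
#   AccountingStorageEnforce=limits,qos
#   PartitionName=limited Nodes=node[01-04] QOS=part_cap
```

GrpJobs caps the QOS as a whole; MaxJobsPerUser would cap each user separately instead.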
--
Brian D. Haymore
University of Utah
Center for High Performance Computing
155 South 1452 East RM 405
Salt Lake City, Ut 84112
Phone: 8
I need to set up a partition that limits the number of jobs allowed to run at
one time. Looking at the slurm.conf page for partition definitions I don't
see a MaxJobs option.
Is there a way to limit the number of jobs in a partition?
Thanks, Jeff
I had found some inconsistent behavior with the epilog that I didn't
understand, but we worked around it at our site and didn't follow up.
https://bugs.schedmd.com/show_bug.cgi?id=6911
On Mon, Dec 9, 2019 at 11:58 AM Brian Andrus wrote:
> Absolutely, which we do, however it is difficult to simul
Absolutely, which we do, however it is difficult to simulate all the
possible job failures/ending/cancellations in 2 minutes or at all for
some things.
So we post on the forums both to see if this has been found out and to
draw attention to the fact that the documentation could be improved by
At the risk of stating the obvious… these seem like the sort of questions that
could be answered with a 2 minute test. Better yet, not just answered, but with
answers specific to your configuration ☺
From: slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] On Behalf Of
Alex Chekholko
Se
Hi,
I had asked a similar question recently (maybe a year ago) and also got
crickets. I think in our case we were not able to ensure that the epilog
always ran for different types of job failures, so we just had the users
add some more cleanup code to the end of their jobs _and_ also run separate
Hi,
We would like to bind slurm to a specific address and thought
NodeAddr=1.2.3.4
CommunicationParameters=NoInAddrAny
would be a good idea. However the manpage says:
"NoInAddrAny - Used to directly bind to the address of what the node
resolves to instead of binding messages to any addre
Open MPI matches available hardware in node(s) against its compiled-in
capabilities. Those capabilities are expressed as modular shared libraries
(see e.g. $PREFIX/lib64/openmpi). You can use environment variables or
command-line flags to influence which modules get used for specific purposes.
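A minimal sketch of both approaches, assuming an Open MPI installation with `ompi_info` and `mpirun` on the PATH. The program name `./my_app` and the process count are placeholders, and the components actually available depend on how your Open MPI was built and on its version.

```shell
# List the byte-transfer-layer (BTL) components this Open MPI build contains.
ompi_info | grep -i "MCA btl"

# Select components on the command line (here: loopback plus TCP only).
mpirun --mca btl self,tcp -np 4 ./my_app

# Same selection through an environment variable.
export OMPI_MCA_btl=self,tcp
mpirun -np 4 ./my_app
```

If a component you expect (for example an InfiniBand-capable one) is missing from the `ompi_info` output, it was most likely not compiled in.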
Hi mercan,
OK, I forgot to compile OpenMPI with Infiniband support... But I still
have a question: the SLURM scheduler assigns (offers) some nodes called
"node0x" to my sbatch job because the nodes in my SLURM cluster were
added under the name "node0x". My OpenMPI application has been (now) compiled
wit