Lou,
Are you installing on the same machine you built?
Are the nvidia libraries installed by RPM or a 'make install' on the box
you compiled it on?
Brian Andrus
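The RPM-versus-'make install' question matters because rpm resolves dependencies against the capabilities recorded in its own database, not against the linker cache. A library put in place by a vendor installer or 'make install' can therefore show up in ldconfig -p and still fail an rpm dependency check. A minimal sketch of checking both views follows; the function names are illustrative, and it assumes an RPM-based system with ldconfig on the PATH.

```shell
# Succeeds if the soname given as $1 appears in `ldconfig -p` output read from stdin.
in_linker_cache() {
    grep -F -q -- "$1"
}

# Succeeds if an installed RPM provides capability $1.
# Returns 2 when the rpm command itself is not available.
rpm_provides() {
    command -v rpm >/dev/null 2>&1 || return 2
    rpm -q --whatprovides "$1" >/dev/null 2>&1
}

# Report whether the linker cache and the RPM database each know the library.
report_lib() {
    soname=$1
    if command -v ldconfig >/dev/null 2>&1 &&
       ldconfig -p 2>/dev/null | in_linker_cache "$soname"; then
        echo "linker cache: $soname found"
    else
        echo "linker cache: $soname not found"
    fi
    if rpm_provides "${soname}()(64bit)"; then
        echo "rpm database: ${soname}()(64bit) provided"
    else
        echo "rpm database: ${soname}()(64bit) not provided"
    fi
}

report_lib libnvidia-ml.so.1
```

If the first line says "found" and the second says "not provided", the library is installed but no RPM declares it, which would match the "Failed dependencies" error in this thread.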
On 8/15/2019 7:53 AM, Lou Nicotra wrote:
I have tried running ldconfig manually as suggested with
slurm-19.05.1-2 and it fails the
Ya, I saw that it was almost removed before 19.05. I didn't know about the NEWS
file! Yep, it's right there, mea culpa; I'll check that in the future!
Best,
Chris
—
Christopher Coffey
High-Performance Computing
Northern Arizona University
928-523-1167
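For batch scripts that still carry the old option, here is a minimal sketch of rewriting --workdir to --chdir on #SBATCH lines with sed. The function name and the /tmp/example path are made up for illustration. It only touches the long option on #SBATCH lines, so srun calls inside the script body would still need to be checked by hand.

```shell
# Rewrite the long option --workdir to --chdir on #SBATCH lines.
# Reads a batch script on stdin and writes the result to stdout.
workdir_to_chdir() {
    sed -E '/^#SBATCH[[:space:]]/ s/--workdir([=[:space:]]|$)/--chdir\1/g'
}

# Example with a hypothetical three-line script: only the #SBATCH line changes.
printf '#!/bin/bash\n#SBATCH --workdir=/tmp/example\necho "--workdir is untouched here"\n' |
    workdir_to_chdir
```

Writing the output to a new file and diffing it against the original before replacing anything keeps the change easy to review.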
On 8/15/19, 11:08 AM, "slurm-users on beha
On 8/15/19 11:02 AM, Mark Hahn wrote:
it's in NEWS, if that counts. also, I note that at least in this commit,
--chdir is added but --workdir is not removed from option parsing.
It went away here:
commit 9118a41e13c2dfb347c19b607bcce91dae70f8c6
Author: Tim Wickberg
Date: Tue Mar 12 23:20:
Looks like the commit is here:
https://github.com/SchedMD/slurm/commit/fddc98533c1f3753e5e43ad6a16407c5cb8c8de8
Yet, no change log on it. Very frustrating.
it's in NEWS, if that counts. also, I note that at least in this commit,
--chdir is added but --workdir is not removed from option parsin
On 8/15/19 7:18 AM, Sajdak, Doris wrote:
Thanks Chris! That worked. We'd tried IP address but not FQDN.
Great to hear!
--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA
I've attached a first draft of a patch to allow network interface
selection for slurm RPCs.
As part of the bring up for one of our systems here we've been wanting
to switch between our management network and IP over IB. As far as I
can tell Slurm doesn't allow the user to select a network interfac
>I have tried running ldconfig manually as suggested with slurm-19.05.1-2 and
>it fails the same way...
>error: Failed dependencies:
>libnvidia-ml.so.1()(64bit) is needed by slurm-19.05.1-2.el7.centos.x86_64
Lou, that's a packaging mistake on the part of the person who created that
Looks like the commit is here:
https://github.com/SchedMD/slurm/commit/fddc98533c1f3753e5e43ad6a16407c5cb8c8de8
Yet, no change log on it. Very frustrating.
Chris
—
Christopher Coffey
High-Performance Computing
Northern Arizona University
928-523-1167
On 8/14/19, 1:30 PM, "slurm-users on beh
I have tried running ldconfig manually as suggested with
slurm-19.05.1-2 and it fails the same way...
error: Failed dependencies:
libnvidia-ml.so.1()(64bit) is needed by
slurm-19.05.1-2.el7.centos.x86_64
ldconfig -p shows:
root@panther02 slurm# ldconfig -p|grep libnvidia-ml.
libnvi
Thanks Chris! That worked. We'd tried IP address but not FQDN.
Dori
-----Original Message-----
From: slurm-users On Behalf Of
Christopher Samuel
Sent: Wednesday, August 14, 2019 5:11 PM
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] AllocNodes on partition no longer working
On
Hello,
the documentation for heterogeneous jobs [1] says that the environment variable SLURM_JOB_ID
should be different for each component. However, I cannot reproduce this
on a fresh slurm-19.05.1 installation.
$ salloc -pcompute -N1 : -pcompute2 -N1
[...]
salloc: Granted job allocation 108453
[...]
bash-4.1$ sque
Christopher Benjamin Coffey writes:
> It seems that --workdir= is no longer a valid option in batch jobs and
> srun in 19.05, and has been replaced by --chdir. I didn't see a change
> log about this, did I miss it? Going through the man pages it seems it
> hasn't existed for some time now actuall