tasks.
You see the result in your variables:
SLURM_NNODES=3
SLURM_JOB_CPUS_PER_NODE=1(x3)
If you only want 2 nodes, set --nodes=2.
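For example, a minimal batch script pinning the job to exactly two nodes might look like this (one task per node is an assumption here, matching the defaults discussed below):

```shell
#!/bin/bash
# Minimal sketch: request exactly 2 nodes, one task per node.
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1

# Inside the job this should report SLURM_NNODES=2
env | grep ^SLURM_NNODES
```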
Brian Andrus
On 8/29/24 08:00, Matteo Guglielmi via slurm-users wrote:
Hi,
On sbatch's manpage there is this example for --nodes:
--nodes=1,5,9,13
so eit
Looks like it ignored that and used ntasks with ntasks-per-node as 1, giving
you 3 nodes. Check your logs and check your conf to see what your defaults are.
Brian Andrus
On 8/29/2024 5:04 AM, Matteo Guglielmi via slurm-users wrote:
Hello,
I have a cluster with four Intel nodes (node[01-04], Feature=intel) and four
AMD nodes (node[05-08], Feature=amd).
# job file
#SBATCH --ntasks=3
#SBATCH --nodes=2,4
#SBATCH --constraint="[intel|amd]"
env | grep SLURM
# slurm.conf
PartitionName=DEFAULT MinNodes=1 MaxNodes=UNLIMITED
Hello,
Does anyone know why this is possible in slurm:
--constraint="[rack1*2&rack2*4]"
and this is not:
--constraint="[rack1*2|rack2*4]"
?
Thank you.
--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
Just replying to my own mail here:
RTFM
So,
It was enough to add the following:
SchedulerParameters=allow_zero_lic
in slurm.conf
From: slurm-users on behalf of Matteo
Guglielmi
Sent: Friday, March 24, 2023 3:03:35 PM
To: slurm-us...@schedmd.com
Dear all,
we have a license server which allocates licenses to a bunch of workstations
not managed with slurm (completely independent boxes) and to the nodes of a
cluster, all managed with slurm.
I wrote a simple script that keeps querying the number of licenses used by the
outside "world" a
at the same time on all nodes
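A sketch of the reconciliation step such a script would perform (the function name, the total of 100 licenses, and the `query_outside_usage` helper are hypothetical; `sacctmgr modify resource ... set count=` is Slurm's documented way to adjust a tracked license count):

```shell
#!/bin/sh
# Hypothetical helper: compute how many licenses Slurm may hand out,
# given the site-wide total and the count consumed outside the cluster.
available_for_slurm() {
    total=$1
    used_outside=$2
    echo $(( total - used_outside ))
}

# In the real script this would run in a loop, e.g.:
#   count=$(available_for_slurm 100 "$(query_outside_usage)")
#   sacctmgr -i modify resource name=mylic set count="$count"
available_for_slurm 100 37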
A great detective story!
> June15 but there is no trace of it anywhere on the disk.
Do you have the process ID (pid) of the watchdog.sh script?
You could look in /proc/<pid>/cmdline and see what that shows.
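Concretely, that check looks like this (pid below is a placeholder; the argv entries in cmdline are NUL-separated, so translate them to spaces for readability):

```shell
# Show the full command line of a running process on Linux.
# argv entries in /proc/<pid>/cmdline are separated by NUL bytes,
# so translate the NULs to spaces before printing.
pid=$$          # replace $$ with the pid of watchdog.sh
tr '\0' ' ' < "/proc/$pid/cmdline"
echo
```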
On 2 July 2018 at 11:37, Matteo Guglielmi wrote:
creating tasks on the other machines?
I would look at the compute nodes while the job is running and do
ps -eaf --forest
Also using mpirun to run a single core gives me the heebie-jeebies...
https://en.wikipedia.org/wiki/Heebie-jeebies_(idiom)
On 29 June 2018 at 13:16, Matteo Guglielmi wrote:
r processes which are started by it on
the other compute nodes will be killed.
I suspect your user is trying to do something "smart". You should give that
person an example of how to reserve 36 cores and submit a charmm job.
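Such an example could be a batch script along these lines (the input/output file names are placeholders, and whether charmm should be launched under srun depends on how it was built at the site):

```shell
#!/bin/bash
# Sketch: reserve 36 cores on one node and run a single parallel
# charmm job across them. File names below are placeholders.
#SBATCH --job-name=charmm
#SBATCH --nodes=1
#SBATCH --ntasks=36

srun charmm < input.inp > output.out
```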
On 29 June 2018 at 12:13, Matteo Guglielmi wrote:
Dear community,
I have a user who usually submits 36 (identical) jobs at a time using a simple
for loop, so all jobs are sbatched at the same time.
Each job requests a single core, and all jobs are independent from one another
(they read different input files and write to different output files).
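The submission pattern described above is roughly the following (the wrapped script and input file names are placeholders; a job array via --array=1-36 would achieve the same with a single sbatch call):

```shell
# Submit 36 independent single-core jobs in one go.
# run_one.sh and the input naming scheme are placeholders.
for i in $(seq 1 36); do
    sbatch --ntasks=1 --wrap="./run_one.sh input_${i}.dat"
done
```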