Re: [slurm-users] Wrong hwloc detected?

2021-11-09 Thread Chris Samuel

On 5/11/21 4:47 am, Diego Zuccato wrote:


How can Slurm detect such an old HWLOC version?


Looking at the code, it's not actually checking the hwloc version. It
hits an error condition and suggests that an old hwloc may be the cause,
but it sounds like that's not the case for you.


src/plugins/task/cgroup/task_cgroup_cpuset.c :

	/* should never happen in normal scenario */
	if ((sock_loop > npdist) && !hwloc_success) {
		/* hwloc_get_obj_below_by_type() fails if no CPU set
		 * configured, see hwloc documentation for details */
		error("hwloc_get_obj_below_by_type() failing, "
		      "task/affinity plugin may be required to address bug "
		      "fixed in HWLOC version 1.11.5");
		return XCGROUP_ERROR;
	} [...]


If you've got support from SchedMD open a bug with them, but if not and 
you're using the Debian packages I'd suggest opening a bug with Debian 
about it.


Best of luck!
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA



Re: [slurm-users] How to get an estimate of job completion for planned maintenance?

2021-11-09 Thread Chris Samuel

On 9/11/21 5:42 am, Loris Bennett wrote:


We just set up a reservation at a point in time which is further in the
future than our maximum run-time.  There is then no need to drain
anything.  Short running jobs can still run right up to the reservation.


This is the same technique we use too, works well!

All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA



Re: [slurm-users] How to get an estimate of job completion for planned maintenance?

2021-11-09 Thread Loris Bennett
Hi Ahmad,

Ahmad Khalifa  writes:

> If I plan maintenance on a certain day, how long before that day
> should I set the queue to drain mode?! Is there a way to estimate the
> completion date / time of current running jobs?!

We just set up a reservation at a point in time which is further in the
future than our maximum run-time.  There is then no need to drain
anything.  Short running jobs can still run right up to the reservation.
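
A minimal sketch of that timing rule in Python: the reservation should start
no earlier than "now" plus the longest permitted run-time, so that no job
already running can overlap it. The function names, the reservation name
"maint" and the 7-day limit are assumptions for illustration only. The
command is built as a string, not executed; check the scontrol keywords
against your own Slurm version before using anything like it.

```python
from datetime import datetime, timedelta


def earliest_safe_start(now: datetime, max_runtime: timedelta) -> datetime:
    """Earliest reservation start that no already-running job can overlap."""
    return now + max_runtime


def reservation_command(start: datetime, duration_minutes: int,
                        name: str = "maint") -> str:
    """Build (not run) an scontrol command line for a maintenance reservation."""
    return (
        "scontrol create reservation"
        f" ReservationName={name}"
        f" StartTime={start.strftime('%Y-%m-%dT%H:%M:%S')}"
        f" Duration={duration_minutes}"
        " Users=root Nodes=ALL Flags=MAINT"
    )


if __name__ == "__main__":
    # Hypothetical site with a 7-day maximum run-time.
    start = earliest_safe_start(datetime.now(), timedelta(days=7))
    print(reservation_command(start, duration_minutes=480))
```

Jobs submitted after the reservation exists are only started if their time
limit lets them finish before it begins, which is why short jobs keep running
right up to the maintenance window.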

Cheers,

Loris

-- 
Dr. Loris Bennett (Herr/Mr)
ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de



Re: [slurm-users] How to get an estimate of job completion for planned maintenance?

2021-11-09 Thread Ole Holm Nielsen

On 11/9/21 13:55, Marcus Wagner wrote:
I have written a script which loops through all running jobs to tell me 
when a job ends on a specific node. This can also be done for all nodes. 
The output for the longest job would be e.g.:


ncm0430  -> 2021-12-04T15:48:35

Nonetheless, we also plan maintenance with reservations; we do not drain 
the partitions.


The pestat script from 
https://github.com/OleHolmNielsen/Slurm_tools/blob/master/pestat also 
prints job ending times on nodes:


$ pestat -E | sort -k 11

You can make all sorts of node selections with the other pestat options.

/Ole



Re: [slurm-users] How to get an estimate of job completion for planned maintenance?

2021-11-09 Thread Marcus Wagner

I have written a script which loops through all running jobs to tell me when 
a job ends on a specific node. This can also be done for all nodes. The output 
for the longest job would be e.g.:

ncm0430  -> 2021-12-04T15:48:35

Nonetheless, we also plan maintenance with reservations; we do not drain the 
partitions.
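
The per-node lookup described above can be sketched roughly as follows. This
is not the script mentioned in this message, only an illustration. It assumes
input lines of the form produced by squeue -h -t RUNNING -o "%N|%e" (node list
and expected end time), and for brevity it assumes each node list is a plain
comma-separated list of host names. Bracketed ranges such as ncm[0430-0432]
would first need expanding, e.g. with scontrol show hostnames. The function
name latest_end_per_node is hypothetical.

```python
from datetime import datetime
from typing import Dict, Iterable

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def latest_end_per_node(lines: Iterable[str]) -> Dict[str, datetime]:
    """Map each node to the latest expected end time of its running jobs.

    Each input line is "NODELIST|END_TIME"; NODELIST is assumed to be a
    plain comma-separated list of host names (no bracketed ranges).
    """
    latest: Dict[str, datetime] = {}
    for line in lines:
        nodelist, sep, end_text = line.strip().partition("|")
        if not sep:
            continue  # not a "NODELIST|END_TIME" line
        try:
            end = datetime.strptime(end_text, TIME_FORMAT)
        except ValueError:
            continue  # e.g. jobs without a known end time
        for node in filter(None, nodelist.split(",")):
            if node not in latest or end > latest[node]:
                latest[node] = end
    return latest


if __name__ == "__main__":
    sample = [
        "ncm0430|2021-12-04T15:48:35",
        "ncm0430,ncm0431|2021-11-20T08:00:00",
    ]
    for node, end in sorted(latest_end_per_node(sample).items()):
        print(f"{node}  -> {end.strftime(TIME_FORMAT)}")
```

The latest value across all nodes gives a lower bound for when a drain would
complete, assuming no further jobs are started in the meantime.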


Best
Marcus


On 05.11.2021 at 23:16, Ahmad Khalifa wrote:

If I plan maintenance on a certain day, how long before that day should I set 
the queue to drain mode?! Is there a way to estimate the completion date / time 
of current running jobs?!

Regards.


--
Dipl.-Inf. Marcus Wagner

IT Center
Group: Server, Storage, HPC
Department: Systems and Operations
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wag...@itc.rwth-aachen.de
www.itc.rwth-aachen.de


