[slurm-users] Re: Issues with Slurm 23.11.1

2024-03-15 Thread Fokke Dijkstra via slurm-users
, Fokke On Wed, 24 Jan 2024 at 16:19, Fokke Dijkstra wrote: > Dear Brian, > > Thanks for the hints, I think you are correctly pointing at some network > connection issue. I've disabled firewalld on the control host, but that > unfortunately did not help. The processes stuck i

Re: [slurm-users] after upgrade to 23.11.1 nodes stuck in completion state

2024-01-30 Thread Fokke Dijkstra
completed epilog for jobid 3679888 > [2024-01-28T17:33:58.774] debug: JobId=3679888: sent epilog complete msg: rc = 0 > > > -- Paul Raines (http://help.nmr.mgh.harvard.edu) > > > > Please note that this e-mail is not secure (encrypted). If you do not > wish to c

Re: [slurm-users] Issues with Slurm 23.11.1

2024-01-24 Thread Fokke Dijkstra
(though the firewall > is not between those two layers). > > -- > Brian D. Haymore > University of Utah > Center for High Performance Computing > 155 South 1452 East RM 405 > Salt Lake City, Ut 84112 > Phone: 801-558-1150 > http://bit.ly/1HO1N2C > ---

[slurm-users] Issues with Slurm 23.11.1

2024-01-23 Thread Fokke Dijkstra
. This leads to many job failures. The issue appears to be somewhat similar to the one described at: https://bugs.schedmd.com/show_bug.cgi?id=18561 In that case, the site downgraded the slurmd clients to 22.05, which got rid of the problems. We’ve now downgraded the slurmd on the compute nodes to 23.02