All,
I have a relatively successful cloud implementation of Slurm in Azure.

I am experiencing an issue with the ResumeProgram not running.
Things work great at first, but after a while slurmctld simply stops calling
the script. I have enabled debug logging on slurmctld, and I see jobs being
assigned to nodes that are idle~, but no calls to the script.

If I restart slurmctld, the backlog starts running and things work.

Any ideas what could cause this?

Brian Andrus
