Re: [slurm-users] [External] Re: Status of BLCR?

2019-10-06 Thread George Wm Turner
I stumbled across CRIU (Checkpoint/Restore In Userspace) https://criu.org/Main_Page a couple of weeks ago. I have not utilized it yet; it's on my ToDo list. They claim that it’s packaged with most distros; I checked RHEL/CentOS and it was there. Be careful of

Re: [slurm-users] [External] Re: Status of BLCR?

2019-10-06 Thread Eliot Moss
On 10/6/2019 9:23 AM, George Wm Turner wrote: I stumbled across CRIU (Checkpoint/Restore In Userspace) https://criu.org/Main_Page a couple of weeks ago. I have not utilized it yet; it's on my ToDo list. They claim that it’s packaged with most distros; I checked RHEL/CentOS and it was there

Re: [slurm-users] Status of BLCR?

2019-10-06 Thread Chris Samuel
On 4/10/19 7:46 pm, Eliot Moss wrote: From what I have read, BLCR would meet my needs for checkpointing, but the admins of both clusters are reluctant to pursue BLCR support. I myself am wondering whether it is still working, etc., and what it means that built-in support has been removed, etc.

[slurm-users] srun: Error generating job credential

2019-10-06 Thread Eddy Swan
Hi All, I am currently testing Slurm version 19.05.3-2 on CentOS 7 with a configuration of one master and 3 nodes. I used the same configuration that works on version 17.02.7, but for some reason it does not work on 19.05.3-2. $ srun hostname srun: error: Unable to create step for job 19: Error gener

Re: [slurm-users] srun: Error generating job credential

2019-10-06 Thread Marcus Wagner
Hi Eddy, what is the result of "id 1000" on the submit host and on piglet-18? Best Marcus On 10/7/19 8:07 AM, Eddy Swan wrote: Hi All, I am currently testing Slurm version 19.05.3-2 on CentOS 7 with a configuration of one master and 3 nodes. I used the same configuration that works on version 17.0
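
The check Marcus asks for can be compared mechanically once the output of "id 1000" has been collected from each host. The following is a minimal sketch, not something posted in the thread: it assumes the usual `uid=… gid=… groups=…` layout of `id` output, the function names are made up for illustration, and the user name `alice` and the sample lines are hypothetical. The thread preview does not say what the cause of the error was, so a difference found this way is only a lead to follow up.

```python
import re


def parse_id_output(text):
    """Parse one line of `id` output into numeric uid, gid and group ids.

    Assumes the common layout: uid=1000(name) gid=1000(name) groups=...
    """
    uid = int(re.search(r"uid=(\d+)", text).group(1))
    gid = int(re.search(r"gid=(\d+)", text).group(1))
    groups = set()
    if "groups=" in text:
        # Take the first whitespace-delimited token after "groups=".
        # Limitation: group names that contain spaces would cut this short.
        groups_part = text.split("groups=", 1)[1].split()[0]
        groups = {int(n) for n in re.findall(r"(?:^|,)(\d+)", groups_part)}
    return {"uid": uid, "gid": gid, "groups": groups}


def differing_fields(first, second):
    """Return the names of the fields that differ between two `id` outputs."""
    a, b = parse_id_output(first), parse_id_output(second)
    return [key for key in ("uid", "gid", "groups") if a[key] != b[key]]


if __name__ == "__main__":
    # Hypothetical sample lines; paste the real output from each host here.
    submit_host = "uid=1000(alice) gid=1000(alice) groups=1000(alice),10(wheel)"
    compute_node = "uid=1000(alice) gid=1001(alice) groups=1001(alice)"
    print(differing_fields(submit_host, compute_node))
```

An empty list means the numeric uid, gid and group ids agree on both hosts; any field named in the result is worth checking by hand.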