Re: [slurm-users] How to deal with jobs that need to be restarted several time

2019-03-13 Thread Selch, Brigitte (FIDF)
To: Slurm User Community List. Subject: Re: [slurm-users] How to deal with jobs that need to be restarted several time. If the failures happen right after the job starts (or close enough), I’d use an interactive session with srun (or some other wrapper that calls srun, such as fisbatch). Our hpc

Re: [slurm-users] How to deal with jobs that need to be restarted several time

2019-03-12 Thread Renfro, Michael
If the failures happen right after the job starts (or close enough), I’d use an interactive session with srun (or some other wrapper that calls srun, such as fisbatch). Our hpcshell wrapper for srun is just a bash function: hpcshell () { srun --partition=interactive $@ --pty bash -i
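For readers who want to try this, here is a minimal sketch of such a wrapper written out in full. It assumes the function in the truncated preview ends right after `bash -i`, and that a partition named `interactive` exists on your cluster (that name is taken from the post, and your site may use a different one). Quoting `"$@"` is a small defensive change so that arguments containing spaces reach srun intact.

```shell
# Sketch of an srun wrapper for interactive debugging sessions.
# Assumption: the cluster has a partition called "interactive".
hpcshell() {
    # Any extra arguments are passed through to srun unchanged,
    # then an interactive bash shell is started on the allocated node.
    srun --partition=interactive "$@" --pty bash -i
}

# Example call (not executed here): request one hour and four CPUs,
# then rerun the failing job by hand inside the shell until it works.
#   hpcshell --time=01:00:00 --cpus-per-task=4
```

Because the shell holds the allocation for as long as it stays open, a user can edit and rerun a failing command repeatedly without queueing again after each failure.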

[slurm-users] How to deal with jobs that need to be restarted several time

2019-03-12 Thread Selch, Brigitte (FIDF)
Hello, some jobs have to be restarted several times before they run. Users start the job, it fails, they make some changes, they start the job again, it fails again, and so on. So they want to keep the resources until the job is running properly. Is there a possibility to 'inherit' alloc