Hello guys!
I'm currently running a 2U server with 8x GTX 1080 Ti inside and Slurm
17.02.7 on top of that. But I can't figure out why Slurm never executes
more than 4 jobs at the same time; the other jobs stay PENDING in the
queue, even though MaxJobs=8.
Does anyone have an idea?
Thank you very much
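In case it helps anyone hitting the same symptom: the REASON column of squeue usually says why jobs sit pending, and scontrol can dump the live scheduler configuration. A quick diagnostic sketch using standard Slurm commands (the grep pattern is just a guess at the relevant keys):

```shell
# List only pending jobs, with the scheduler's reason for holding each one
squeue --states=PENDING -o "%.10i %.9P %.8u %.2t %.20R"

# Dump the running configuration and look for job-count limits
scontrol show config | grep -iE 'MaxJob|Oversubscribe'
```

If the reason shown is something like QOSMaxJobsPerUserLimit or AssocMaxJobsLimit, the cap is coming from accounting limits rather than slurm.conf.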
So it's all my fault: Slurm did its job and restarted cleanly when
logrotate triggered it.
Sorry for that!
On Thu, Sep 29, 2016 at 2:05 PM, Janne Blomqvist <janne.blomqv...@aalto.fi>
wrote:
> On 2016-09-27 10:39, Philippe wrote:
> > If I can't use logrotate, what must I use ?
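For the record, Slurm's daemons reopen their log files on SIGUSR2, so logrotate can be used without restarting slurmctld at all. A sketch of such a stanza (the log path and logrotate file location are assumptions for illustration):

```shell
# /etc/logrotate.d/slurmctld  (sketch; adjust the log path to your setup)
/var/log/slurm/slurmctld.log {
    weekly
    rotate 4
    compress
    missingok
    postrotate
        # SIGUSR2 tells slurmctld to reopen its log file, no restart needed
        pkill -SIGUSR2 -x slurmctld
    endscript
}
```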
tis <desan...@usf.edu> wrote:
>
> Philippe,
>
> > But every 3 days, the slurmctld process restarts by itself, as you
> > can see in slurmctld.log :
>
> ... SNIP ...
>
> > No crontab set, anythin
:)
Thank you very much
Regards,
Philippe
$SLURMD_OPTIONS
PIDFile=/var/run/slurmd.pid
LimitMEMLOCK=infinity
[Install]
WantedBy=multi-user.target
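Stitched together, the whole slurmd.service unit might look like the following (only the fragments above come from the original; the Description, After, Type, ExecStart path, and EnvironmentFile location are assumptions typical of packaged units):

```ini
[Unit]
Description=Slurm node daemon
After=network.target munge.service

[Service]
Type=forking
EnvironmentFile=-/etc/sysconfig/slurmd
ExecStart=/usr/sbin/slurmd $SLURMD_OPTIONS
PIDFile=/var/run/slurmd.pid
LimitMEMLOCK=infinity

[Install]
WantedBy=multi-user.target
```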
Hope this helps,
Philippe
----- Original Message -----
From: Christopher Samuel sam...@unimelb.edu.au
To: slurm-dev slurm-dev@schedmd.com
Sent: Tuesday, 23 June 2015 01:50:50
Subject: [slurm
It has its own task identifiers, and users want to see this ID in the log
file name.
Can I manually call srun -o logfile for each task of an array?
Thank you for your help !
Philippe
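For what it's worth, Slurm's output filename patterns already cover the array case: %A expands to the array master job ID and %a to the task index (see the sbatch filename-pattern documentation). If the application's own task ID matches the array index, a minimal sketch looks like this (my_task is a placeholder program name):

```shell
#!/bin/bash
#SBATCH --array=0-9
#SBATCH --output=mytask_%A_%a.log   # %A = array job ID, %a = array task index
srun ./my_task "$SLURM_ARRAY_TASK_ID"
```

If the application ID differs from the array index, calling srun -o with a computed filename inside the script, as you suggest, should also work.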
: user philippe (uid=555) has no active jobs.
Connection closed by 10.0.0.34
$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
...
1748 debug job_open philippe R 0:05 1 node-05-05
$ ssh node-05-05
Last login: Thu Sep 26 18:06:28
Thanks a lot for your quick answer.
Is there any trick to limit the number of tasks running concurrently in
an array job? Either on a per-user basis or system-wide.
Regards,
Philippe and Georges
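Two approaches that may apply, both sketched here with hedges: recent Slurm versions accept a throttle suffix on --array (this may not exist in older releases), and sacctmgr can set association limits cluster-wide, assuming accounting is configured:

```shell
# Per-job: run a 100-task array but never more than 4 tasks at once (%4 throttle)
sbatch --array=0-99%4 job.sh

# Cluster-wide: cap a user's concurrently running jobs in the accounting database
sacctmgr modify user philippe set MaxJobs=4
```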
On 13/09/2013 20:08, Moe Jette wrote:
Slurm supports a large number of limits
Is
there a way with Slurm to:
- either let a user limit the maximum number of his own running
tasks?
- or set a cluster-wide limit for every user?
Any suggestion would be most welcome,
Regards,
Philippe and Georges
PS: Previously, we used to run SGE on this cluster. With SGE