On the one hand, you say you want "to *allocate a whole node* for a single
multi-threaded process," but on the other you say you want to allow it to
"*share nodes* with other running jobs." Those seem like mutually exclusive
requirements.
Jason
On Thu, Aug 1, 2024 at 1:32 PM Henrique Almeida via
Hello all,
The Slurm docs have me a bit confused... I'm wanting to enable job
preemption on certain partitions but not others. I *presume* I would
set PreemptType=preempt/partition_prio globally, but then on the partitions
where I don't want jobs to be able to be preempted, I would set
user root in place?
>
> sreport accounts resources reserved for a user as well (even if not
> used by jobs) while sacct reports job accounting only.
>
> Best regards
> Jürgen
>
>
> * Jason Simms via slurm-users [240429 10:47]:
> > Hello all,
> >
> > E
Hello all,
Each week, I generate an automated report of the top users by CPU hours.
This week, for whatever reason, the user root accounted for a massive number
of hours:
Login Proper Name Used
As a related point, for this reason I mount /var/log separately from /. Ask
me how I learned that lesson...
Jason
On Tue, Apr 16, 2024 at 8:43 AM Jeffrey T Frey via slurm-users <
slurm-users@lists.schedmd.com> wrote:
> AFAIK, the fs.file-max limit is a node-wide limit, whereas "ulimit -n"
> is
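To see the two limits mentioned above side by side on a given node, here is a minimal sketch, assuming a Linux host where the fs.file-max value is exposed at /proc/sys/fs/file-max. The function names are made up for illustration; the per-process value comes from RLIMIT_NOFILE, whose soft limit is what "ulimit -n" reports in a shell.

```python
import resource
from pathlib import Path


def node_wide_file_max(path="/proc/sys/fs/file-max"):
    """Return the kernel's node-wide open-file limit, or None if it cannot be read.

    Assumption: Linux, with the sysctl exposed under /proc/sys.
    """
    try:
        return int(Path(path).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def per_process_nofile():
    """Return (soft, hard) RLIMIT_NOFILE for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = per_process_nofile()
    print(f"fs.file-max (node-wide):      {node_wide_file_max()}")
    print(f"RLIMIT_NOFILE (this process): soft={soft} hard={hard}")
```

Running it inside a job step (for example via srun) rather than on a login node should show the limits a job would actually inherit, which may differ from an interactive shell.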
Hello Matthew,
You may be aware of this already, but most sites would make these kinds of
checks/validations using job_submit.lua. I'm not an expert in that - though
plenty of others on this list are - but I'm positive you could implement
this type of validation logic. I'd like to say that I've
Hello Thomas,
I know I'm a few days late to this, so I'm wondering whether you've made
any progress. We experience this, too, but in a different way.
First, though, you may be aware, but you should use salloc rather than srun
--pty for an interactive session. That's been the preferred method for
Hello Daniel,
In my experience, if you have a high-speed interconnect such as IB, you
would do IPoIB. You would likely still have a "regular" Ethernet connection
for management purposes, and yes, that means both an IB switch and an
Ethernet switch, but that switch doesn't have to be anything
Hello all,
I've used the "scontrol write batch_script" command to output the job
submission script from completed jobs in the past, but for some reason, no
matter which job I specify, it tells me it is invalid. Any way to
troubleshoot this? Alternatively, is there another way - even if a manual