Quoting "V. Ram" <[email protected]>:

>
> Hello Slurm Folks,
>
> In a mailing list thread earlier this year (
> https://groups.google.com/d/topic/slurm-devel/zaH9aXhBXLA/discussion ),
> Moe Jette mentioned that at the time, nodes assigned to multiple
> partitions could have their resources overcommitted if the
> Shared=Exclusive parameter was not set on all partitions including the
> nodes.
>
> Is this limitation still present in Slurm?  If not, in which version was
> it addressed?

That limitation is still present in SLURM. Each SLURM partition maintains  
an independent bitmap of available resources, so a node placed in several  
partitions can have its resources allocated through each of them. This  
design is needed for job preemption, where a job in a lower-priority  
partition is preempted by a job in a higher-priority partition when the  
preemption mode is configured to suspend/resume.
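For reference, partition-priority preemption of that kind is typically set up along these lines in slurm.conf. This is only a rough sketch: the node and partition names are made up, and exact parameter spellings can vary between versions, so check the slurm.conf man page for your release.

```
# Preempt based on the relative Priority of the partitions,
# suspending (rather than killing) the lower-priority job.
PreemptType=preempt/partition_prio
PreemptMode=SUSPEND,GANG     # suspend/resume requires gang scheduling

NodeName=node[01-04] Sockets=2 CoresPerSocket=8 State=UNKNOWN
# Both partitions contain the same nodes; jobs in the
# higher-priority partition can preempt jobs in the lower one.
PartitionName=low  Nodes=node[01-04] Priority=1  State=UP
PartitionName=high Nodes=node[01-04] Priority=10 State=UP
```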


> We would very strongly like to have nodes in multiple partitions in such
> a way that the consumable resource on each node could be either sockets
> or cores, without needing to worry about overcommitment.

Using one partition and QOS may satisfy your needs. See
http://www.schedmd.com/slurmdocs/qos.html
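As a rough sketch of that suggestion (the QOS names and limit values below are hypothetical examples; the qos.html page above lists the options your version actually supports), limits and priorities can be attached to QOSes via sacctmgr and then enforced on a single shared partition:

```
# Create two QOSes with different priorities; the "urgent" QOS
# may preempt jobs running under the "normal" QOS.
sacctmgr add qos normal
sacctmgr modify qos normal set Priority=10 GrpCPUs=64

sacctmgr add qos urgent
sacctmgr modify qos urgent set Priority=100 Preempt=normal

# In slurm.conf, enforcement of these limits must be enabled:
#   AccountingStorageEnforce=limits,qos
```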


> There is a table in the documentation on
> http://www.schedmd.com/slurmdocs/cons_res_share.html which implies that
> if Shared=No, we should see the behavior desired, but the bottom of
> that documentation page indicates a last modified date in 2008, which is
> older than the thread I linked above with Moe's comments, so I'm still
> pretty uncertain.

I will try to make that documentation page more clear.


> Thank you.
