[slurm-dev] Re: Issue with Job Size Factor of Multifactor plugin

Moe Jette Wed, 22 Aug 2012 07:46:05 -0700

Quoting Miguel Méndez <[email protected]>:

> Hi,
>
> Job Size Factor in Multifactor Priority Plugin gets its value considering
> relative job size, and this size is relative to "node_record_count". The
> problems I see with this are two:
>
> - "node_record_count" includes my login node, which is never going to be
> used to run jobs. I would solve this by just subtracting one from this value.


Only compute nodes are needed in your node list. The login node would
generally not be included.


> - "node_record_count" includes all existing nodes in the cluster, doesn't
> matter if they are down. I think Job Size priority should be relative to
> the maximum size of a job that could be run if there were no other jobs
> running in the cluster. So if I have a 70 node cluster, with 2 nodes down,
> and a 10 node job, priority for this job should be 10/68, not 10/70.
>
> What would be the easiest way of getting the number of allocated or idle
> nodes? I have been through slurmctld and sinfo code, but I understand they
> use loops for this, and I would prefer not having to do this every time I
> recalculate priorities.

bit_set_count(avail_node_bitmap) will give you the count of nodes up
and available very quickly in the slurmctld daemon.
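
As a rough illustration of the idea, here is a minimal, self-contained
sketch in C. It is not the plugin's code: count_set_bits() is a
hypothetical stand-in for bit_set_count(), the plain uint64_t array
stands in for avail_node_bitmap, and job_size_fraction() is an assumed
helper that only shows the 10/68 versus 10/70 arithmetic from the
question. How the multifactor plugin actually weights the result is
not shown here.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for Slurm's bit_set_count(): returns the number
 * of set bits in an array of 64-bit words. Each set bit represents one
 * node that is up and available.
 */
int count_set_bits(const uint64_t *words, size_t nwords)
{
    int count = 0;

    for (size_t i = 0; i < nwords; i++) {
        uint64_t w = words[i];

        while (w) {
            w &= w - 1;     /* clear the lowest set bit */
            count++;
        }
    }
    return count;
}

/*
 * Assumed helper: job size relative to the number of available nodes,
 * clamped to [0, 1]. Returns 0 when no nodes are available so the
 * caller never divides by zero.
 */
double job_size_fraction(int job_nodes, int avail_nodes)
{
    double fraction;

    if (avail_nodes <= 0)
        return 0.0;

    fraction = (double) job_nodes / (double) avail_nodes;
    return (fraction > 1.0) ? 1.0 : fraction;
}
```

With 70 nodes of which 2 are down, the bitmap has 68 bits set, so a
10 node job gets 10/68 (about 0.147) rather than 10/70 (about 0.143).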


> Thanks,
>
> Miguel
>
