Don,

Thanks for the answer, and thanks to Moe as well; it's exactly what I needed.

I had also thought about the situation you describe, but this "injustice"
would only last until the next time priorities are recalculated (I have
that set to one minute). So I think it would be very rare to see the
second job you mention run before the first one, because two things would
need to happen within that minute: some nodes would have to go from down
to up, and some job(s) would have to finish so that their nodes become
available in the cluster.

Regards,

Miguel

On Wed, Aug 22, 2012 at 5:47 PM, Lipari, Don <[email protected]> wrote:

> Miguel,
>
> While you make a good argument to base the job size factor on the number
> of active nodes, this could confuse users.  You may get users who wonder
> why their priority is too low and look to you to explain how their job
> priority was calculated.  It is easier to say that the job size
> component is the number of nodes their job requested / the (fixed) number
> of nodes in the cluster.
>
> As the number of active nodes varies over time, you could have a 10 node
> job receive a job size factor of 10/68, and an instant later have the two
> bad nodes come back up.  Assuming a second, 10 node job is submitted, it
> would receive a job size factor of 10/70.  The user of the second job could
> complain that that is not fair.
>
> Don
>
> > -----Original Message-----
> > From: Moe Jette [mailto:[email protected]]
> > Sent: Wednesday, August 22, 2012 7:59 AM
> > To: slurm-dev
> > Subject: [slurm-dev] Re: Issue with Job Size Factor of Multifactor plugin
> >
> >
> > Quoting Miguel Méndez <[email protected]>:
> >
> > > Hi,
> > >
> > > Job Size Factor in Multifactor Priority Plugin gets its value
> > > considering relative job size, and this size is relative to
> > > "node_record_count". The problems I see with this are two:
> > >
> > > - "node_record_count" includes my login node, which is never going
> > > to be used to run jobs. I would solve this by just subtracting one
> > > from this value.
> >
> > Only compute nodes are needed in your node list. The login node would
> > generally not be included.
> >
> >
> > > - "node_record_count" includes all existing nodes in the cluster,
> > > no matter if they are down. I think Job Size priority should be
> > > relative to the maximum size of a job that could be run if there
> > > were no other jobs running in the cluster. So if I have a 70 node
> > > cluster, with 2 nodes down, and a 10 node job, priority for this
> > > job should be 10/68, not 10/70.
> > >
> > > What would be the easiest way of getting the number of allocated or
> > > idle nodes? I have been through the slurmctld and sinfo code, but I
> > > understand they use loops for this, and I would prefer not having to
> > > do this every time I recalculate priorities.
> >
> > bit_set_count(avail_node_bitmap) will give you the count of nodes up
> > and available very quickly in the slurmctld daemon.
> >
> >
> > > Thanks,
> > >
> > > Miguel
> > >
>