Don,

Thanks for the answer, and thanks to Moe as well; it's exactly what I needed.
I also thought about the point you raise, but this "injustice" would only last until the next time priorities are recalculated (I have the interval set to one minute). So I think it would be very rare for the second job you mention to run before the earlier one, because two things would need to happen within that minute: some nodes go from down to up, and some job(s) finish so that their nodes become available in the cluster.

Regards,

Miguel

On Wed, Aug 22, 2012 at 5:47 PM, Lipari, Don <[email protected]> wrote:
> Miguel,
>
> While you make a good argument for basing the job size factor on the number
> of active nodes, this could confuse users. You may get users who wonder
> why their priority is too low and look to you to explain how their job
> priority was calculated. It is easier to say that the job size component
> is the number of nodes their job requested divided by the (fixed) number
> of nodes in the cluster.
>
> As the number of active nodes varies over time, you could have a 10 node
> job receive a job size factor of 10/68, and an instant later have the two
> bad nodes come back up. Assuming a second 10 node job is submitted, it
> would receive a job size factor of 10/70. The user of the second job could
> complain that this is not fair.
>
> Don
>
> > -----Original Message-----
> > From: Moe Jette [mailto:[email protected]]
> > Sent: Wednesday, August 22, 2012 7:59 AM
> > To: slurm-dev
> > Subject: [slurm-dev] Re: Issue with Job Size Factor of Multifactor plugin
> >
> > Quoting Miguel Méndez <[email protected]>:
> >
> > > Hi,
> > >
> > > The Job Size factor in the Multifactor Priority plugin gets its value
> > > from the relative job size, and this size is relative to
> > > "node_record_count". I see two problems with this:
> > >
> > > - "node_record_count" includes my login node, which is never going to
> > > be used to run jobs. I would solve this by just subtracting one from
> > > this value.
> >
> > Only compute nodes are needed in your node list. The login node would
> > generally not be included.
> >
> > > - "node_record_count" includes all existing nodes in the cluster, no
> > > matter whether they are down. I think the Job Size priority should be
> > > relative to the maximum size of a job that could be run if there were
> > > no other jobs running in the cluster. So if I have a 70 node cluster
> > > with 2 nodes down, and a 10 node job, the priority for this job should
> > > be 10/68, not 10/70.
> > >
> > > What would be the easiest way of getting the number of allocated or
> > > idle nodes? I have been through the slurmctld and sinfo code, but I
> > > understand they use loops for this, and I would prefer not to do this
> > > every time I recalculate priorities.
> >
> > bit_set_count(avail_node_bitmap) will give you the count of nodes up
> > and available very quickly in the slurmctld daemon.
> >
> > > Thanks,
> > >
> > > Miguel
