On Tue, 21 Oct 2014 10:45:30 +0000 "Winkler, Ursula (ursula.wink...@uni-graz.at)" <ursula.wink...@uni-graz.at> wrote:
> Hi Reuti,
>
> no - and there is no (other than reputed slots) resource shortage. And no
> host is in error state.
>
> Ursula
>
>
> -----Original Message-----
> From: Reuti [mailto:re...@staff.uni-marburg.de]
> Sent: Tuesday, 21 October 2014 12:25
> To: Gridengine Users Group
> Cc: Winkler, Ursula (ursula.wink...@uni-graz.at)
> Subject: Re: [gridengine users] Weird scheduler calculation error?
>
> Hi,
>
> On 21.10.2014 at 11:21, Ursula Winkler wrote:
>
> > Hi gridengine members,
> >
> > For now I have run out of ideas with an annoying problem:
> >
> > A job with 72 slots does not start because "qstat -j <jobid>" reports
> > "cannot run in PE "mpios" because it only offers 64 slots", but there are
> > 72 free ("qalter -w p <jobid>" and "qalter -w v <jobid>" report
> > "verification: found possible assignment with 72 slots" and nothing more).
> > The PE and host configurations are ok. I already restarted the
> > master process and the execution daemons on the hosts concerned, but (as
> > expected) that didn't help.
>

The qsub man page states that -w p and -w v don't take load values into account. Possibly the job is requesting a complex whose value is determined by a load sensor, and the returned value is not suitable but is not causing an alarm.

Alternatively, perhaps a higher-priority job has a resource reservation in place? We run with schedd_job_info false, so I don't have much experience with it, but I would expect qstat -j to take reservations into account, while qalter -w [pv] can't, as they're basically just doing the calculation for that one job.

--
William Hay <w....@ucl.ac.uk>
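
To illustrate the two explanations suggested in this thread, here is a toy Python sketch. It is not Grid Engine code and makes no claim about how the scheduler really computes anything; the Host class, both counting functions, the host names and the 9 x 8 slot layout are all hypothetical, picked only so the numbers match the 72 and 64 from the report. It just shows how a check that ignores load values and reservations can report more slots than a scheduler run that honours them.

```python
# Toy model only: NOT Grid Engine's real scheduling logic.
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    slots: int            # slots configured for the PE on this host (hypothetical)
    reserved: int = 0     # slots held by a reservation for another job
    load_ok: bool = True  # False if a load-sensor value rules the host out


def slots_verify(hosts):
    """Count slots like a verification-style check might:
    load values and other jobs' reservations are ignored."""
    return sum(h.slots for h in hosts)


def slots_scheduler(hosts):
    """Count slots like a scheduler run might:
    skip unsuitable hosts and subtract reserved slots."""
    return sum(h.slots - h.reserved for h in hosts if h.load_ok)


# Hypothetical cluster: 9 hosts with 8 slots each,
# one of them ruled out by a load value without raising an alarm.
hosts = [Host(f"node{i:02d}", 8) for i in range(1, 10)]
hosts[-1].load_ok = False

print(slots_verify(hosts))     # 72
print(slots_scheduler(hosts))  # 64
```

Setting reserved=8 on one host instead of load_ok=False gives the same 72 versus 64 gap, which is why the two explanations are hard to tell apart from the messages alone.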
_______________________________________________ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users