On Tue, 21 Oct 2014 10:45:30 +0000
"Winkler, Ursula (ursula.wink...@uni-graz.at)" <ursula.wink...@uni-graz.at> 
wrote:

> Hi Reuti,
> 
> no - and there is no resource shortage (other than the alleged shortage of 
> slots). And no host is in an error state.
> 
> Ursula
> 
> 
> -----Original Message-----
> From: Reuti [mailto:re...@staff.uni-marburg.de] 
> Sent: Tuesday, 21 October 2014 12:25
> To: Gridengine Users Group
> Cc: Winkler, Ursula (ursula.wink...@uni-graz.at)
> Subject: Re: [gridengine users] Weird scheduler calculation error?
> 
> Hi,
> 
> On 21.10.2014 at 11:21, Ursula Winkler wrote:
> 
> > Hi gridengine members,
> > 
> > I have run out of ideas on an annoying problem:
> > 
> > A job with 72 slots does not start: "qstat -j <jobid>" reports 
> > "cannot run in PE "mpios" because it only offers 64 slots", but there are 
> > 72 free ("qalter -w p <jobid>" and "qalter -w v <jobid>" both report 
> > "verification: found possible assignment with 72 slots" and nothing more). 
> > The PE and host configurations are OK. I have already restarted the 
> > master process and the execution daemons on the hosts concerned, but (as 
> > expected) that didn't help.
>
The qsub man page states that -w p and -w v don't take load values into 
account.  Possibly the job is requesting a complex whose value is determined 
by a load sensor, and the value returned is unsuitable for the job without 
being bad enough to cause an alarm.
 
Alternatively, perhaps a higher-priority job has a resource reservation in 
place?  We run with schedd_job_info set to false, so I don't have much 
experience with it, but I would expect qstat -j to take reservations into 
account, while qalter -w [pv] can't, as they're basically just doing the 
calculation for that one job.
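
For what it's worth, here is a rough sketch of the read-only checks I would 
line up to compare the two views.  The function names diag_commands and 
run_diag and the job id 12345 are made up for illustration; "mpios" is the PE 
from this thread.  I haven't run this against your cluster, so treat it as a 
starting point and not a recipe.

```shell
#!/bin/sh
# Sketch only: list (or run) read-only Grid Engine checks for a job that
# qstat -j says cannot be scheduled although qalter -w p/v finds a fit.
# diag_commands and run_diag are hypothetical helper names.

diag_commands() {
    jobid=$1   # job id to inspect (placeholder)
    pe=$2      # parallel environment name, e.g. mpios
    cat <<EOF
qstat -j $jobid
qalter -w p $jobid
qconf -sp $pe
qconf -ssconf
qhost -F
qstat -f -explain a
EOF
}

run_diag() {
    # Refuse to run where the Grid Engine client tools are not installed.
    command -v qstat >/dev/null 2>&1 || {
        echo "Grid Engine client tools not found" >&2
        return 1
    }
    diag_commands "$1" "$2" | while IFS= read -r cmd; do
        echo "### $cmd"
        sh -c "$cmd" </dev/null
    done
}
```

Calling "diag_commands 12345 mpios" only prints the command list; "run_diag 
12345 mpios" executes it.  In the output I would look at the load-sensor-fed 
complexes in qhost -F, and at max_reservation and schedd_job_info in 
qconf -ssconf.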


 

-- 
William Hay <w....@ucl.ac.uk>


_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
