'ncpus' still exists, but only in 17 'old' jobs - ones that were submitted
before we made the 'unset' change. I guess I should wait until these finish
and then re-test the system?
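
In case it matters, this is roughly how I'm counting them (just a sketch; it
assumes Torque's qstat -f output, where the request shows up as
Resource_List.ncpus):

  # count jobs still carrying an ncpus request (skip the two qstat header lines)
  for job in $(qstat | awk 'NR>2 {print $1}'); do
      qstat -f "$job" | grep -q 'Resource_List.ncpus' && echo "$job"
  done | wc -l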

diagnose -n says, for example, on node28:

node28   Busy   0:4   2926:3950   1:1   3871:7641   1.00  DEFAUL  [NONE]  DEF   2.19  002  [heavy_2:4][light_4:4][b_que  [DEFAULT]  [NONE]
WARNING:  node 'node28' has more processors utilized than dedicated (4 > 2)
-----    ---   6:86   72602:98716   26:26   142420:212774

But this node is running 2 jobs, and neither of them has an 'ncpus' setting
when I check them with qstat -f.
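
For reference, this is the kind of check I did (the job IDs below are just
placeholders for the two jobs that qstat shows on node28):

  qstat -f 1234.ourserver | grep -i ncpus    # prints nothing
  qstat -f 1235.ourserver | grep -i ncpus    # prints nothing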

About the MEM requirement: do you mean we should unset that too? Other than
ncpus, we don't set any MEM requirement in our qsub script at all.
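
To be concrete, the resource requests in our qsub scripts look roughly like
this (the ppn and walltime values here are only illustrative):

  #PBS -q b_que
  #PBS -l nodes=1:ppn=2
  #PBS -l walltime=24:00:00
  # no "-l mem=..." line (and no ncpus any more) anywhere in the script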


On Jan 29, 2008 8:13 PM, Jan Ploski <[EMAIL PROTECTED]> wrote:

>
> I suppose you did check with qstat -f that 'ncpus' is not mentioned
> anywhere any longer?
>
> Maybe it has something to do with the MEM requirement (just a wild
> guess... but try removing it). What does diagnose -n say for a node
> which is incorrectly rejecting the job? Does it have enough free
> "tokens" (not sure if this is what they are called officially) to run
> the job in this b_que class?
>
> Regards,
> Jan Ploski
>
_______________________________________________
mauiusers mailing list
mauiusers@supercluster.org
http://www.supercluster.org/mailman/listinfo/mauiusers
