Re: [Toolserver-l] SGE queues stalled

Merlissimo Wed, 05 Dec 2012 08:57:18 -0800

On 05.12.2012 16:21, Morten Wang wrote:

Is there a way for me to find that out myself, e.g. using qstat?  I had a
look at the qstat man-page, but judging by the descriptions it looks like
something I'd have to fiddle around with if/when a job gets queued for a
long time at some point in the future to figure out how to do.


qstat -j <jobnumber>

lists a scheduling info section.

Example:
qstat -j 799111

scheduling info:

queue instance "short-...@ortelius.toolserver.org" dropped because it is overloaded: np_load_short=1.252930 (= 1.252930 + 0.8 * 0.000000 with nproc=4) >= 1.1
queue instance "longrun-...@willow.toolserver.org" dropped because it is overloaded: np_load_short=2.528320 (= 2.528320 + 0.8 * 0.000000 with nproc=8) >= 2.0
queue instance "medium-...@ortelius.toolserver.org" dropped because it is overloaded: np_load_short=1.252930 (= 1.252930 + 0.8 * 0.000000 with nproc=4) >= 0.8
queue instance "longrun2-...@clematis.toolserver.org" dropped because it is disabled
queue instance "longrun2-...@hawthorn.toolserver.org" dropped because it is disabled
(-l h_rt=57600,mem_free=890M,sql=1,sql-s7-rr=3,sqlprocs-s7=3,tmp_free=20M,user_slot=2,virtual_free=890M) cannot run globally because it offers only gc:sql-s7-rr=0.000000

As you can see, the job cannot run on clematis and hawthorn because
these queues are disabled. The queues on willow and ortelius have a
temporarily high load. wolfsbane, nightshade and yarrow are missing from
this list, so the bot could start on these servers. But the last line,
"cannot run globally because it offers only gc:sql-s7-rr=0.000000",
shows that the resource sql-s7-rr is not available on any server at the
moment. That's why the job stays queued until the s7 database is usable
again.
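
If you would rather not read the scheduling info by eye, the same reasoning can be roughly automated. The following is only a minimal sketch, not an official tool: it assumes the messages have exactly the wording shown in the example above, and the names SAMPLE, QUEUE_RE, GLOBAL_RE and summarize are made up for illustration. It works on a pasted copy of the text and does not call qstat itself.

```python
import re

# Three lines copied from the example scheduling info (spacing cleaned up).
SAMPLE = """\
queue instance "short-...@ortelius.toolserver.org" dropped because it is overloaded: np_load_short=1.252930 (= 1.252930 + 0.8 * 0.000000 with nproc=4) >= 1.1
queue instance "longrun2-...@clematis.toolserver.org" dropped because it is disabled
cannot run globally because it offers only gc:sql-s7-rr=0.000000
"""

# Assumed message formats, taken only from the example output above.
QUEUE_RE = re.compile(r'queue instance "([^"]+)" dropped because it is (\w+)')
GLOBAL_RE = re.compile(
    r'cannot run globally because it offers only gc:([\w-]+)=([\d.]+)'
)


def summarize(text):
    """Return (dropped, missing).

    dropped maps each queue instance to the reason it was dropped;
    missing lists the global resources the job is waiting for.
    """
    dropped = {}
    missing = []
    for line in text.splitlines():
        match = QUEUE_RE.search(line)
        if match:
            dropped[match.group(1)] = match.group(2)
            continue
        match = GLOBAL_RE.search(line)
        if match:
            missing.append(match.group(1))
    return dropped, missing


if __name__ == "__main__":
    dropped, missing = summarize(SAMPLE)
    for queue, reason in dropped.items():
        print(queue, "->", reason)
    print("waiting for global resources:", missing)
```

A non-empty list of missing global resources is the interesting case: as with sql-s7-rr above, the job cannot start on any server until that resource is available again, regardless of how many queues are otherwise free.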


Merlissimo

_______________________________________________
Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
Posting guidelines for this list: 
https://wiki.toolserver.org/view/Mailing_list_etiquette
