Hi,
A user with the hi QOS submits a job, but the job then sits in the Idle
state. There are 11 procs available and some 20 other jobs in the Q
state at lower priority, yet the job (id 191803) does not start.
It can take a very long time until the job starts, sometimes more than an
hour. I think it only starts when a running job ends, and only then
does the hi QOS job finally get into R status. But I'm having
trouble confirming this theory.
The question is: there are 11 procs available, so why doesn't the job
start immediately? It only needs one proc, and there are 11 free procs, but
checkjob says 'insufficient idle procs available: 0 < 1'.

Some details:


Running checkjob on the job gives:

//================//
 checking job 191803
 State: Idle
 ...
 ...
 Flags:       RESTARTABLE
 PE:  1.00  StartPriority:  1004
 job cannot run in partition DEFAULT (insufficient idle procs available: 0 < 1)
//================//



showq shows that there are plenty of procs available:

//================//
  showq
  59 Active Jobs      77 of   88 Processors Active (87.50%)
//================//
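
For reference, the figure of 11 free procs is simply the total minus the
active count in the showq summary line (88 - 77). A minimal Python sketch
of that arithmetic follows; the helper name idle_procs is my own and is
not part of Maui, and the regular expression only assumes the summary
line looks like the one pasted above.

```python
import re

# Summary line copied from the showq output above.
SUMMARY = "59 Active Jobs      77 of   88 Processors Active (87.50%)"


def idle_procs(summary: str) -> int:
    """Return total minus active processors from a showq summary line.

    Hypothetical helper: it only understands the 'N of M Processors Active'
    wording shown in this post.
    """
    match = re.search(r"(\d+)\s+of\s+(\d+)\s+Processors Active", summary)
    if match is None:
        raise ValueError("unrecognized showq summary line")
    active, total = int(match.group(1)), int(match.group(2))
    return total - active


print(idle_procs(SUMMARY))  # 11
```

This is the number that disagrees with the '0 < 1' reported by checkjob.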


showres lists one reservation per running job (59 reservations for 59 active jobs):

//================//
showres
...
...
59 reservations located
//================//


showbf -A reports that no procs are available for backfill:

//================//
showbf -A
backfill window (user: '[ALL]' group: '[ALL]' partition: ALL) Mon Jan 21 23:04:45

no procs available
//================//


And finally, diagnose -p shows that this job has a higher priority
than the other jobs in the Idle state:

//================//
diagnose -p
diagnosing job priority information (partition: ALL)

Job                    PRIORITY*   Cred(Group:  QOS)  Serv(QTime)
             Weights   --------       1(    1:    2)     1(    1)

191802                     1008    99.3(1000.:  0.0)   0.7(  7.5)
191772                      431     0.0(  0.0:  0.0) 100.0(430.6)
...
...
...
191799                      289     0.0(  0.0:  0.0) 100.0(289.4)
191800                      289     0.0(  0.0:  0.0) 100.0(289.4)

Percent Contribution   --------    21.5( 21.5:  0.0)  78.5( 78.5)
//================//
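
As a sanity check on the table above, here is a minimal Python sketch that
recomputes the PRIORITY column and the percentage split. It assumes the
numbers in parentheses are already-weighted contributions that simply add
up, and that "1000." is 1000.0 cut off by the column width; both are my
assumptions, and the function name summarize is hypothetical.

```python
def summarize(group: float, qos: float, qtime: float):
    """Return (priority, cred %, serv %) from the per-factor contributions.

    Assumption: the parenthesized values in diagnose -p are additive.
    """
    cred = group + qos   # Cred component = Group + QOS contributions
    serv = qtime         # Serv component = QTime contribution
    total = cred + serv
    return (
        round(total),
        round(100 * cred / total, 1),
        round(100 * serv / total, 1),
    )


# Rows copied from the diagnose -p output above.
print(summarize(1000.0, 0.0, 7.5))   # job 191802
print(summarize(0.0, 0.0, 430.6))    # job 191772
print(summarize(0.0, 0.0, 289.4))    # jobs 191799 and 191800
```

Under those assumptions the results match the table (1008 with 99.3/0.7,
431 and 289 with 0.0/100.0). Note that in this output the QOS column
contributes 0.0; the whole boost for the top job comes from the Group
column.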


Again, the question is: there are 11 procs available, so why doesn't the job
start immediately? It only needs one proc, and there are 11 free procs, but
checkjob says 'insufficient idle procs available: 0 < 1'.

Thanks,
Itay.
_______________________________________________
mauiusers mailing list
mauiusers@supercluster.org
http://www.supercluster.org/mailman/listinfo/mauiusers
