#158: BibSched erroneously thinks that tasks failed to run
-----------------------+----------------------
  Reporter:  skaplun   |      Owner:  skaplun
      Type:  defect    |     Status:  assigned
  Priority:  critical  |  Milestone:
 Component:  BibSched  |    Version:
Resolution:            |   Keywords:
-----------------------+----------------------

Comment (by simko):

 Here are logs from INSPIRE showing the task dance:

 Case study 1: (note that bibindex seems to start 10 seconds before
 bibrank actually finishes)

 {{{

 2011-04-05 14:39:12 -> StandardError: Process bibindex (task_id: 1)
 was launched but seems not to be able to reach RUNNING status.

 2011-04-05 14:35:42 --> Task #9335 (bibupload) started
 2011-04-05 14:35:56 --> Task #9335 (bibupload) exited
 2011-04-05 14:35:56 --> Task #2 (webcoll) started
 2011-04-05 14:36:40 --> Task #2 (webcoll) exited
 2011-04-05 14:36:41 --> Task #3 (bibreformat) started
 2011-04-05 14:37:36 --> Task #4 (bibrank) started
 2011-04-05 14:37:39 --> Task #3 (bibreformat) exited
 2011-04-05 14:38:47 --> Task #1 (bibindex) started
 2011-04-05 14:38:57 --> Task #4 (bibrank) exited
 2011-04-05 14:39:56 --> Task #1 (bibindex) exited
 }}}

 Case study 2:

 {{{

 2011-04-03 21:29:32 -> StandardError: Process bibreformat (task_id: 3)
 was launched but seems not to be able to reach RUNNING status.

 2011-04-03 21:24:06 --> Task #4 (bibrank) started
 2011-04-03 21:24:19 --> Task #4 (bibrank) exited
 2011-04-03 21:24:21 --> Task #1 (bibindex) started
 2011-04-03 21:24:36 --> Task #1 (bibindex) exited
 2011-04-03 21:24:36 --> Task #2 (webcoll) started
 2011-04-03 21:24:48 --> Task #2 (webcoll) exited
 2011-04-03 21:29:06 --> Task #3 (bibreformat) started
 2011-04-03 21:30:11 --> Task #3 (bibreformat) exited
 }}}

 So it looks like the tasks ran fine and bibsched wrongly reported
 that they had failed to launch.

 After talking to Sam, it appears this was happening for Benoit on ADS
 too, so it is probably related to the loading of big citation
 dictionaries, which at times does not finish within
 5 * `CFG_BIBSCHED_REFRESHTIME` seconds (= 25 seconds on INSPIRE).
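
 As a rough cross-check of the 25-second figure, the gap between each
 task's "started" line and the matching StandardError can be computed
 from the timestamps quoted in the two case studies.  This is only a
 throwaway sketch; the helper name `seconds_between` is made up for
 illustration and is not part of bibsched.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def seconds_between(start, end):
    """Return the number of seconds from `start` to `end` (log timestamps)."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# (task "started" timestamp, StandardError timestamp), copied from the logs
cases = {
    "bibindex (case study 1)": ("2011-04-05 14:38:47", "2011-04-05 14:39:12"),
    "bibreformat (case study 2)": ("2011-04-03 21:29:06", "2011-04-03 21:29:32"),
}

gaps = {name: seconds_between(start, end) for name, (start, end) in cases.items()}
for name, gap in gaps.items():
    print(f"{name}: error raised {gap:.0f} s after start")
```

 The gaps come out at 25 and 26 seconds, which is consistent with the
 error being raised once the 5 * `CFG_BIBSCHED_REFRESHTIME` window has
 run out.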

 Before reviving the lazy loading of citation dictionaries branch, which
 should help here, I have doubled the task launch detection interval
 on the INSPIRE production machine.  (Basically, using
 `count = 10` instead of `count = 5` in `bibsched.py`.)
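
 To illustrate why a larger `count` helps, here is a minimal sketch of
 a launch-detection loop of this kind.  It is not the actual code from
 `bibsched.py`: the function `wait_until_running`, the `SlowTask` class
 and the "WAITING" status are hypothetical, and the refresh time of
 5 seconds is inferred from the 25-second figure (25 / 5).

```python
import time

# Assumption: 5 seconds, inferred from "5 * CFG_BIBSCHED_REFRESHTIME = 25 seconds".
CFG_BIBSCHED_REFRESHTIME = 5


def wait_until_running(get_status, count=5,
                       refresh_time=CFG_BIBSCHED_REFRESHTIME,
                       sleep=time.sleep):
    """Poll get_status() up to `count` times, `refresh_time` seconds apart.

    Return True as soon as the task reports RUNNING, or False if it never
    does within roughly count * refresh_time seconds.
    """
    for _ in range(count):
        if get_status() == "RUNNING":
            return True
        sleep(refresh_time)
    return False


class SlowTask:
    """Hypothetical task needing `startup` simulated seconds to reach RUNNING."""

    def __init__(self, startup):
        self.startup = startup
        self.elapsed = 0

    def sleep(self, seconds):
        # Advance simulated time instead of really sleeping.
        self.elapsed += seconds

    def status(self):
        return "RUNNING" if self.elapsed >= self.startup else "WAITING"


# A task that needs 40 s to start (e.g. while loading large dictionaries).
short = SlowTask(startup=40)
reached_with_5 = wait_until_running(short.status, count=5, sleep=short.sleep)

longer = SlowTask(startup=40)
reached_with_10 = wait_until_running(longer.status, count=10, sleep=longer.sleep)

print(reached_with_5, reached_with_10)
```

 With `count = 5` the simulated task is reported as failed to launch
 even though it would have started fine; with `count = 10` it is
 detected as RUNNING.  A slow enough start would still trip the longer
 window, so this only buys time.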

-- 
Ticket URL: <http://invenio-software.org/ticket/158#comment:3>
Invenio <http://invenio-software.org>
