Peter Körner <osm-li...@mazdermind.de> wrote:

> For a few days now I've been getting weird errors when submitting tasks.

> My Cronjob calls
> "/home/mazder/public_html/replicate-sequences/update-submit.sh"
> which contains the following command:

> qcronsub -l h_rt=0:05:00 -l virtual_free=100M -l arch=* -l
> sql-user-m=1 -N mazder-replicate-sequences -m as -o
> '/home/mazder/public_html/replicate-sequences/sge'
> /home/mazder/public_html/replicate-sequences/update-run.sh

> Most of these calls produce the error below, which seems not
> to be an error in my code as I neither use xml nor python.

> Do you have any idea what's going wrong?

> [...]

An educated guess: The Python errors come from the script
/sge/GE/bin/sol-amd64/qjobtest that is called as part of
qcronsub to test whether a job with that name is already
running.  qjobtest parses the output of "qstat -xml ..."
which in normal operation returns a valid XML document.  My
assumption is that when SGE is down, qstat returns its error
messages ("error: commlib error: can't connect to service
(Connection refused)", etc.) as plain text.  That text can't
be parsed as XML, which in turn causes qjobtest to barf.
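
To illustrate the guess above, here is a minimal Python sketch of
how such a check can fail and how it could be guarded.  It is not
the actual qjobtest source: the function name job_is_listed, the
sample documents, and the assumption that job names appear in
JB_name elements are mine, for illustration only.

```python
import xml.etree.ElementTree as ET


def job_is_listed(qstat_output, job_name):
    """Return True/False if job_name is/isn't listed in the XML.

    Return None when the text is not well-formed XML at all, which
    is what plain-text error messages would look like to a parser.
    """
    try:
        root = ET.fromstring(qstat_output)
    except ET.ParseError:
        # Plain-text errors end up here instead of crashing the
        # caller with a traceback.
        return None
    # Assumption: each job carries its name in a JB_name element.
    return any(el.text == job_name for el in root.iter("JB_name"))


# Hypothetical sample of what "qstat -xml" might print normally.
sample_ok = """<job_info>
  <queue_info>
    <job_list state="running">
      <JB_name>mazder-replicate-sequences</JB_name>
    </job_list>
  </queue_info>
</job_info>"""

# The plain-text error quoted above, as printed when SGE is down.
sample_down = ("error: commlib error: can't connect to service "
               "(Connection refused)")

if __name__ == "__main__":
    print(job_is_listed(sample_ok, "mazder-replicate-sequences"))
    print(job_is_listed(sample_down, "mazder-replicate-sequences"))
```

Without the try/except, the second call would raise a ParseError,
which would match the kind of XML-related Python traceback you saw.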

In short: This is another artefact of SGE being down at that
moment.  You can't do anything about it, so just ignore it.

Tim


_______________________________________________
Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
Posting guidelines for this list: 
https://wiki.toolserver.org/view/Mailing_list_etiquette
