Hello,
Yes, that's true: parts of the SGE environment had not been running since about
4 am UTC.
The problem should be fixed now, as far as I can see.
Cheers
Marlen
On Thu, 2 Aug 2012, Krinkle wrote:
Date: Thu, 2 Aug 2012 09:15:56
From: Krinkle <krinklem...@gmail.com>
Reply-To: Wikimedia Toolserver <toolserver-l@lists.wikimedia.org>
To: Toolserver-l <toolserver-l@lists.wikimedia.org>
Subject: Re: [Toolserver-l] JSV stderr: File "/sge/GE/bin/sol-amd64/qjobtest"
So I've set "-m n" for now on the qcronsub entry, but it turns out (obviously)
that this doesn't help.
The error report doesn't come from SGE, qcronsub or cronie/cronietab. The error
is from the low-level cron itself because the SGE executable is somehow broken.
I don't want to disable e-mails for all cron jobs globally, since they're quite
useful and should be rare. When one is sent, it usually means a syntax error
(which is easily caught and useful to know about), or that something is broken
at a lower level on the Toolserver (say, SGE itself), which is what is
happening right now.
On Aug 1, 2012, at 11:42 PM, Krinkle wrote:
Hi,
Please fix this (or at least turn it off so that it doesn't emit more emails).
Assuming there is a way to turn off e-mail notifications for stuff like this
from submit.toolserver.org,
perhaps someone could include that in the recommended "example" cronietab
snippet?
The use case is the many people running things on the Toolserver that should be
"always running". The documentation recommends doing this by using a named SGE
job and attempting to start it every minute from cronietab on
submit.toolserver.org.
When the job is already running, qsub does nothing; otherwise it starts it. The
problem, however, is that if SGE has issues, an e-mail with the stack trace is
sent *every minute* (even if the job in question is already running fine).
I'd like to know when my bot is down and can't be started (so I can start it
manually). But I only need one e-mail for that, and definitely not an e-mail
every minute whenever SGE has an issue, regardless of whether the job in
question is already running without problems.
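One way to get "only one e-mail" would be to wrap the cron command in a small script that prints the failure output (which is what makes cron send mail) at most once per interval. The following is only a rough sketch under assumptions that are not confirmed anywhere in this thread: that qcronsub exits with a non-zero status when SGE is broken, and that Python 3 is available on the submit host. The names `run_quietly`, `stamp` and `min_interval` are made up for illustration.

```python
import subprocess
import sys
import time
from pathlib import Path


def run_quietly(cmd, stamp, min_interval=3600, now=None):
    """Run cmd and return the text that should be shown to cron.

    Returns the command's combined output if it failed and no failure
    was reported during the last min_interval seconds; otherwise ''.
    """
    now = time.time() if now is None else now
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    if proc.returncode == 0:
        # Success: forget the last report so the next failure mails at once.
        if stamp.exists():
            stamp.unlink()
        return ""
    if stamp.exists():
        last_report = float(stamp.read_text())
        if now - last_report < min_interval:
            # Already reported recently: stay silent, so cron sends nothing.
            return ""
    # Remember when this failure was reported, then pass the output on.
    stamp.write_text(str(now))
    return proc.stdout


def main(argv):
    # Hypothetical usage: quiet_cron.py STAMP_FILE COMMAND [ARGS...]
    sys.stdout.write(run_quietly(argv[2:], Path(argv[1])))
```

The cronietab entry would then call this wrapper (via `main(sys.argv)`) with a stamp file path and the existing qcronsub command line. Because cron only sends mail when a job produces output, a broken SGE would result in one mail per interval instead of one per minute.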
Estimated time when the error started: 150 minutes ago
-- Krinkle
Begin forwarded message:
From: r...@toolserver.org (Cron Daemon)
Subject: Cron <krinkle@hawthorn> qcronsub -b y -N dbbot_wm -l h_rt=INFINITY -l virtual_free=90M $HOME/bots/dbbot-wm-start.sh
Date: August 1, 2012 11:32:03 PM PDT
To: krin...@toolserver.org
error: JSV stderr: Traceback (most recent call last):
error: JSV stderr: File "/sge/GE/bin/sol-amd64/qjobtest", line 108, in <module>
error: JSV stderr: dom = minidom.parse(child_stdout)
error: JSV stderr: File "/opt/ts/python/2.7/lib/python2.7/site-packages/_xmlplus/dom/minidom.py", line 1915, in parse
error: JSV stderr: return expatbuilder.parse(file)
error: JSV stderr: File "/opt/ts/python/2.7/lib/python2.7/site-packages/_xmlplus/dom/expatbuilder.py", line 930, in parse
error: JSV stderr: result = builder.parseFile(file)
error: JSV stderr: File "/opt/ts/python/2.7/lib/python2.7/site-packages/_xmlplus/dom/expatbuilder.py", line 207, in parseFile
error: JSV stderr: parser.Parse(buffer, 0)
error: JSV stderr: xml.parsers.expat.ExpatError: syntax error: line 1, column 0
Unable to run job: JSV stderr: Traceback (most recent call last):
JSV stderr: File "/sge/GE/bin/sol-amd64/qjobtest", line 108, in <module>
JSV stderr: dom = minidom.parse(child_stdout)
JSV stderr: File "/opt/ts/python/2.7/lib/python2.7/site-packages/_xmlplus/dom/minidom.py", line 1915, in parse
JSV stderr: return expatbuilder.parse(file)
JSV stderr: File "/opt/ts/python/2.7/lib/python2.7/site-packages/_xmlplus/dom/expatbuilder.py", line 930, in parse
JSV stderr: result = builder.parseFile(file)
JSV stderr: File "/opt/ts/python/2.7/lib/python2.7/site-packages/_xmlplus/dom/expatbuilder.py", line 207, in parseFile
JSV stderr: parser.Parse(buffer, 0)
JSV stderr: xml.parsers.expat.ExpatError: syntax error: line 1, column 0
JSV stderr is - xml.parsers.expat.ExpatError: syntax error: line 1, column 0.
Exiting.
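A note on reading the traceback above: an ExpatError at "line 1, column 0" is what the expat parser reports when its input is not XML from the very first character. That suggests qjobtest received something other than XML from its child process, though the thread does not confirm what it actually received. A minimal sketch of guarding such a parse follows. It is not the actual qjobtest code; `parse_status_xml` and the sample inputs are hypothetical.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def parse_status_xml(text):
    """Return a DOM for text, or None if text is not well-formed XML."""
    try:
        return minidom.parseString(text)
    except ExpatError:
        # Covers both empty input and plain-text error messages.
        return None


# Made-up inputs: a plain-text message instead of XML, and a tiny document.
print(parse_status_xml("not xml at all") is None)
print(parse_status_xml("<job_info/>") is not None)
```

With a guard like this, a caller could emit a single short message instead of a full stack trace when the scheduler is not answering.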
_______________________________________________
Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
Posting guidelines for this list:
https://wiki.toolserver.org/view/Mailing_list_etiquette