Hi,
I added this warning today (as previously agreed with DaB.). It appears when
a job is submitted without an arch resource. There are two reasons:
# First, the default setting will change next week from "solaris only" to
"all servers". This was announced in July
(<http://lists.wikimedia.org/pipermail/toolserver-l/2012-July/005110.html>).
# Second, because of server problems over the last two days, many jobs
needed a longer runtime, which led to higher load on willow. Last night
some jobs waited for hours until willow was available again, although
other servers had unused CPU and memory at the same time.
In most cases you can simply add " -l arch='*' " as an argument to
qcronsub/qsub without any problems. Most scripts should run on both Solaris
and Linux, but you should test this beforehand to be sure. If your job
can currently only run on Solaris, you must add "-l arch=sol"
before the default setting changes next week. For more information
see <https://wiki.toolserver.org/view/Job_scheduling>.
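To find out which of your own submissions would still trigger the warning, a small check like the following may help. It is only a sketch, not the logic of the real jsv script; the function name has_arch_request is made up for this example, and it only looks for an "arch=" entry in the arguments of "-l".

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the given qsub/qcronsub arguments
# already contain an arch request, 1 otherwise. Nothing is submitted.
has_arch_request() {
    while [ $# -gt 0 ]; do
        if [ "$1" = "-l" ] && [ $# -ge 2 ]; then
            # -l takes a comma-separated resource list, so arch may
            # appear at the start or after a comma.
            case "$2" in
                arch=*|*,arch=*) return 0 ;;
            esac
            shift   # skip the resource list that belongs to -l
        fi
        shift
    done
    return 1
}

# Example: the command from the cron mail quoted below, with the arch
# request added, would pass the check:
#   qcronsub -N dbbot_wm -m n -j y -b y -l h_rt=INFINITY \
#       -l virtual_free=90M -l arch='*' "$HOME/bots/dbbot-wm-start.sh"
```

Pass the helper the same arguments you give to qcronsub; a return status of 1 means that the line still needs "-l arch='*'" or "-l arch=sol".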
I also noticed that during the user-store outage on Sunday only one job was
waiting for some hours because of the missing resource "fs-user-store", while
many people complained about their failed jobs. If your job needs a
special resource, check whether it can be requested on
<https://wiki.toolserver.org/view/Job_scheduling#Optional_resources>.
SGE will execute your job only when the requested resource is available.
If your job is already running and a needed resource disappears, you can also
exit your script with code "99". SGE then requeues your job and runs it
once the resource is available again.
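As an illustration of the exit code "99" approach, a job script could test for the resource before it starts working. This is a minimal sketch: the function name check_resource and the path in RESOURCE_DIR are assumptions made for the example, so replace the path with the directory your job really depends on.

```shell
#!/bin/sh
# Hypothetical path; replace it with the directory your job needs.
RESOURCE_DIR="${RESOURCE_DIR:-/mnt/user-store}"

# Return 0 if the directory is reachable, 99 otherwise.
check_resource() {
    if [ -d "$1" ]; then
        return 0
    fi
    echo "resource $1 is unavailable, asking SGE to requeue" >&2
    return 99
}

# In a real job script, call it before (and, for long jobs, during) the work:
#   check_resource "$RESOURCE_DIR" || exit 99
```

The check lives in a function so that the same test can be repeated between work steps; only the "exit 99" in the job script itself tells SGE to requeue the job.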
@Krinkle: You got the message while I was hacking on the live jsv script. I
simply copied the runtime warning message and then changed it. This was
so easy that I did not bother to disable jsv while rewriting it.
Currently there is enough CPU and memory free in total for all user
scripts. SGE jobs are executed on five different servers, and more servers
could be added easily. The main problem is load distribution:
many users do not use SGE, which is bad on a shared system and leads to
overload on a few servers. So please use cronie on the host submit and
qsub/qcronsub to submit jobs to SGE instead of running them directly on a
particular server. Toolserver hardware is getting older, and a server may go
away suddenly because of problems. With SGE you do not have to care
about that.
Merlissimo
P.S.: I want to thank DaB. for his efforts to get more money for
hardware for the Toolserver cluster next year. I also think this is really
needed, especially for the database servers. You can follow the
discussion at
<http://meta.wikimedia.org/wiki/Talk:Wikimedia_Deutschland/2013_annual_plan_draft/de#Toolserver>.
On 24.09.2012 18:31, Krinkle wrote:
On Sep 24, 2012, at 6:20 PM, Platonides <platoni...@gmail.com> wrote:
On 24/09/12 18:07, Krinkle wrote:
Can someone decode this? What is this?
-- Krinkle
Begin forwarded message:
*From: *r...@toolserver.org <mailto:r...@toolserver.org> (Cron Daemon)
*Subject: **Cron <krinkle@hawthorn> qcronsub -N dbbot_wm -m n -j y -b
y -l h_rt=INFINITY -l virtual_free=90M "$HOME/bots/dbbot-wm-start.sh"*
*Date: *September 24, 2012 6:05:07 PM GMT+02:00
*To: *krin...@toolserver.org <mailto:krin...@toolserver.org>
warning: Please add maximum runtime by adding parameter [33m-l
arch=[0msol|lx
The text asks you to set a time limit. The parameter (embedded in
posix colors despite not being output to a terminal) specifies whether the
job needs a linux or solaris server.
However, if I try to execute it, I get a much saner message:
$ qcronsub -N dbbot_wm -m n -j y -b y -l h_rt=INFINITY -l
virtual_free=90M "/home/krinkle/bots/dbbot-wm-start.sh"
Unable to run job: Script not executable: /home/krinkle/bots/dbbot-wm-start.sh.
Exiting.
warning: Please add the os this job can run on by adding parameter -l
arch='*'|sol|lx
For more information read documentation at
https://wiki.toolserver.org/view/Job_scheduling
As this is a php script, your parameter would be «-l arch='*'»
Yes, I've added `-l arch='*'` to it already a minute ago.
Warnings are gone, not sure why it nagged about maximum runtime, it already has
INFINITY.
I'm not sure why arch=x isn't the default though, or maybe it is but outputs
the warning anyway?
A warning like that may be useful, but do consider that cronie from submit will
send e-mails for it.
-- Krinkle
_______________________________________________
Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
Posting guidelines for this list:
https://wiki.toolserver.org/view/Mailing_list_etiquette