On 07.09.2012 at 18:39, Lars van der bijl wrote:

> On 7 September 2012 17:48, Reuti <re...@staff.uni-marburg.de> wrote:
>> On 07.09.2012 at 17:45, Lars van der bijl wrote:
>> 
>>> On 7 September 2012 17:23, Reuti <re...@staff.uni-marburg.de> wrote:
>>>>> would it be possible to change the execd to put any job that does not
>>>>> exit with 0 into an error state? regardless of it being a kill -9?
>>>> 
>>>> You can rerun the job automatically if you exit the epilog with 99.
>>> 
>>> yes, but with 137 or 139 i can't. and as the task hasn't finished
>>> successfully, i don't want it to start its dependencies. i'd rather it
>>> just go to an error state.
>> 
>> Do you observe that a job being rescheduled by exit 99 will trigger its
>> successors held by -hold_jid to start?
>> 
> no, when i'm able to raise a 99 exit status it will not trigger its
> dependencies. however, a task killed with 137 or 139 does.
> and I'd rather they error out with 100 than be removed from the
> queue altogether.
> 
> i know that the grid uses 137 when you request a qdel, and this makes
> it kinda hard to stop a task if everything else were put in a 100
> error state.

No, the chain of events is the other way round. The `qdel` will send SIGKILL
to the job and remove it from the list of jobs in the system. Whatever you do
or set in the epilog doesn't matter, as the job is removed by the `qdel` anyway.
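
For reference, the 137 and 139 from above follow the usual shell convention of
128 plus the signal number (9 = SIGKILL, 11 = SIGSEGV). A minimal sketch in
Python which decodes such a status; the function name `describe_exit_status`
is made up for this example, and the meanings of 99 and 100 are taken from
this thread:

```python
import signal


def describe_exit_status(status: int) -> str:
    """Explain an exit status as discussed in this thread (hypothetical helper)."""
    if status == 0:
        return "success"
    if status == 99:
        # From the thread: exiting with 99 gets the job rescheduled.
        return "exit 99: job is rescheduled"
    if status == 100:
        # From the thread: exiting with 100 puts the job into an error state.
        return "exit 100: job goes to error state"
    if status > 128:
        # Shell convention: 128 + signal number means "killed by that signal".
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "signal %d" % signum
        return "killed by " + name
    return "failed with exit code %d" % status


if __name__ == "__main__":
    for code in (0, 99, 100, 137, 139):
        print(code, "->", describe_exit_status(code))
```

So a 137 alone does not tell you whether the SIGKILL came from a `qdel` or from
somewhere else.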

You can for example:

- Submit all jobs with a user hold on the successor(s). This user hold can
be removed in the epilog of the predecessor if it ran successfully. The
name/jobid of the successor to be released could be put in a job context, which
you have to read in the epilog and act on accordingly.

- Create a special queue for some kind of "enabler" jobs which run forever
(looping e.g. once a minute until they quit). The original job will create/touch
a special file for which the "enabler" is waiting. Once the existence of the
relevant file is detected, the "enabler" can release the hold of a certain job
or even just submit the successor job.

- A workflow can be created with http://wildfire.bii.a-star.edu.sg/ (tool GEL,
see http://wildfire.bii.a-star.edu.sg/docs/gel_ref.pdf), where you can check
for files. But the jobs will be submitted during the workflow and not all in
advance. Maybe it is useful anyway.
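
The first two options could be sketched roughly like this in Python. It is only
an outline under some assumptions: the function names are made up, the exit
status of the job and the jobid of the successor are passed in as arguments
(how the epilog gets them, e.g. from the job context, is left out), and the
successor was submitted with a user hold so that `qrls -h u` applies:

```python
import os
import time
from typing import Callable, List, Optional


def release_command(job_id: str) -> List[str]:
    """Command line which removes the user hold from the given job."""
    return ["qrls", "-h", "u", job_id]


def epilog_release(exit_status: int,
                   successor: Optional[str]) -> Optional[List[str]]:
    """First option: release the successor only after a clean exit.

    Returns the command to run, or None if the hold should stay in place.
    """
    if exit_status == 0 and successor:
        return release_command(successor)
    return None  # predecessor failed or was killed: successor stays on hold


def wait_for_marker(path: str,
                    on_found: Callable[[], None],
                    interval: float = 60.0,
                    max_polls: Optional[int] = None) -> bool:
    """Second option: the "enabler" loop.

    Checks once per `interval` seconds whether `path` exists and calls
    `on_found` when it does. `max_polls` (an addition for this sketch)
    limits the number of checks; None means loop forever.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        if os.path.exists(path):
            on_found()
            return True
        polls += 1
        time.sleep(interval)
    return False
```

In a real epilog the returned list would be handed to something like
`subprocess.run(cmd, check=True)`, and the `on_found` callback of the "enabler"
could do the same `qrls` call or submit the successor.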

-- Reuti.
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
