On 07.09.2012 at 18:39, Lars van der bijl wrote:

> On 7 September 2012 17:48, Reuti <re...@staff.uni-marburg.de> wrote:
>> On 07.09.2012 at 17:45, Lars van der bijl wrote:
>>
>>> On 7 September 2012 17:23, Reuti <re...@staff.uni-marburg.de> wrote:
>>>>> would it be possible to change the execd to put any job that does not
>>>>> exit with 0 into an error state? regardless of it being a kill -9?
>>>>
>>>> You can rerun the job automatically if you exit the epilog with 99.
>>>
>>> yes but with 137 or 139 i can't. and as the task hasn't successfully
>>> finished i don't want it to start it's dependencies. i'd rather it
>>> just go to a error state.
>>
>> You observe, that a job being rescheduled by exit 99 will trigger its
>> successors by -hold_jid to start?
>>
> no when i'm able to raise a 99 exit status it will not trigger it's
> dependencies. however a task killed because of 137 or 139 do.
> and I'd rather them error out with 100 them to be removed from the
> queue all together.
>
> i know that the grid uses 137 when you request a qdel. and this makes
> it kinda hard to stop a task if anything else would be put in a 100
> error state.

No, the chain of events is the other way round. The `qdel` sends SIGKILL to the job and removes it from the list of jobs in the system (whatever you do or set in the epilog doesn't matter, as the job is going to be removed by the `qdel` anyway).

You can for example:

- Submit all jobs with a user hold on the successor(s). This user hold can be removed in the epilog of the predecessor if it ran successfully. The name/jobid of the successor to be released could be put in a job context, which you have to read in the epilog and act on accordingly.

- Create a special queue for some kind of 'enabler' jobs which run forever (looping e.g. once a minute until they quit). The original job will create/touch a special file for which the 'enabler' is waiting. If the existence of the relevant file is detected, the 'enabler' can release the hold on a certain job or even just submit the successor job.

- Create a workflow with the tool GEL from http://wildfire.bii.a-star.edu.sg/ (reference: http://wildfire.bii.a-star.edu.sg/docs/gel_ref.pdf), where you can check for files. But the jobs will be submitted during the workflow and not all in advance. Maybe it is useful anyway.

-- Reuti
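
A rough sketch of the first option (user hold on the successor, released from the predecessor's epilog) could look like the following. It is only an illustration under several assumptions: the successor was submitted first with `qsub -h`, the predecessor was submitted with `qsub -ac release_on_success=<successor jobid>` (the context key `release_on_success` is a made-up name), the context shows up as a comma-separated `context:` line in the output of `qstat -j`, the execution host is allowed to run `qrls`, and the epilog has some site-specific way of learning the job's exit status, which is simply passed in as a parameter here.

```python
import subprocess

# Hypothetical context key; anything agreed on at submission time would do.
CONTEXT_KEY = "release_on_success"


def parse_context(qstat_output, key=CONTEXT_KEY):
    """Return the value stored under `key` in the 'context:' line of
    `qstat -j <jobid>` output, or None if it is not present.
    Assumes a line of the form 'context:   name=value,name2=value2'."""
    for line in qstat_output.splitlines():
        if line.startswith("context:"):
            for pair in line.split(":", 1)[1].strip().split(","):
                name, _, value = pair.partition("=")
                if name.strip() == key:
                    return value.strip()
    return None


def release_command(exit_status, successor):
    """Build the qrls call only if the predecessor exited with 0.
    For any other status (e.g. 137 or 139) the successor stays on hold."""
    if exit_status == 0 and successor:
        return ["qrls", successor]
    return None


def epilog(job_id, exit_status):
    """Look up the successor in the job context and release its user hold.
    How `exit_status` is obtained inside the epilog is left to the site."""
    out = subprocess.run(["qstat", "-j", job_id], capture_output=True,
                         text=True, check=True).stdout
    cmd = release_command(exit_status, parse_context(out))
    if cmd:
        subprocess.run(cmd, check=True)
```

With this approach a job killed by a signal never releases its successor, so the dependent job just stays in the queue with its user hold until someone releases or deletes it by hand.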