To Grid Engine Users,
The "queue_conf" man page says the following about exit codes for both the
"prolog" and "epilog" attributes:
Exit codes for the prolog/epilog attribute can be interpreted based on the
following exit values:
0: Success
99: Reschedule job
100: Put job in error state
Anything else: Put queue in error state
When a job and/or job task from an array job exits with an exit code of 100, we
see something like the following from the qstat command:
job-ID  prior    name       user     state  submit/start at      queue  jclass  slots  ja-task-ID
------------------------------------------------------------------------------------------------
1009    0.54976  epi_ex100  tdhf781  Eqw    09/08/2016 13:11:42                 1
What I would like to ask is if there is a way to "trap" all other exit code
values other than 0, 99 and 100 so that jobs or job tasks show up with a job
state of "Eqw" or some error state?
In the epilog script that I've set up for our jobs, I've attempted to capture
the value of the "exit_status" of a job or job task and, if it isn't 0, 99 or
100, exit the epilog script with an "exit 100". However, this doesn't appear
to work.
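
For reference, the logic I'm attempting in the epilog can be sketched roughly
as below. This is only an illustration, not the actual script: the function
names are made up, the "exit_status=N" key=value text is assumed to come from
wherever the epilog obtains the job's exit status, and exiting 0 for job
statuses 0, 99 and 100 is also an assumption. It shows the intended mapping
only and says nothing about whether Grid Engine honours it.

```python
# Hypothetical sketch of the epilog decision logic described above.
# Exit codes documented in queue_conf for prolog/epilog scripts.
EPILOG_SUCCESS = 0
EPILOG_JOB_ERROR = 100

# Job exit statuses that already have a defined meaning and are left alone.
HANDLED_JOB_STATUSES = {0, 99, 100}


def parse_exit_status(usage_text):
    """Return the integer after 'exit_status=' in key=value text, or None.

    Assumption: the job's exit status is available to the epilog as a
    line of the form 'exit_status=N'.
    """
    for line in usage_text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key == "exit_status":
            try:
                return int(value)
            except ValueError:
                return None
    return None


def epilog_exit_code(job_exit_status):
    """Map the job's exit status to the code the epilog exits with.

    Statuses 0, 99 and 100 are left alone (epilog exits 0); anything
    else, including an unknown status (None), becomes 100.
    """
    if job_exit_status in HANDLED_JOB_STATUSES:
        return EPILOG_SUCCESS
    return EPILOG_JOB_ERROR
```

An epilog built on this would end with something like
sys.exit(epilog_exit_code(parse_exit_status(text))), where "text" is the
key=value content it managed to read.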
Another way of stating what I'm trying to convey: if the exit status of a job
or job task is anything other than 0, 99 or 100, put the job in error state.
If this can be done, then we would know that a job didn't complete correctly,
and if it is in Eqw state we have the option of clearing the error state (i.e.
"qmod -cj") and re-executing the job.
Wayne Lee