A little late, but I am running 8.1.7 and suspend only worked part of the time.
I had to write my own suspend script to make it work, especially with MATLAB jobs, which try to trap signals.
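
The script mentioned above is not shown in this thread. As a rough sketch only, a custom suspend script could look something like the following. It assumes the queue's suspend_method is pointed at the script and passes the job's PID as the first argument (queue_conf(5) describes a $job_pid pseudo-variable for that), and that the job runs in its own process group led by that PID. The function name suspend_job_group is made up for illustration. It sends SIGSTOP, which a job cannot trap, to the whole group.

```bash
#!/bin/bash
# Hypothetical suspend helper, not the script referred to in the message.
# Assumption: $1 is the PID of the job's process-group leader.

suspend_job_group() {
    local pgid="$1"
    # Accept only a numeric PID greater than 1, so that the negated value
    # can never address process group 0 or every process (-1).
    case "$pgid" in
        ''|*[!0-9]*|0|1)
            echo "usage: suspend_job_group <pgid>" >&2
            return 2
            ;;
    esac
    # SIGSTOP cannot be caught, blocked, or ignored by the job.
    # A negative PID addresses every process in that process group.
    kill -s STOP -- "-$pgid"
}

# When run as a script with an argument, act on it. With no arguments
# (for example when sourced), only the function is defined.
if [ "$#" -gt 0 ]; then
    suspend_job_group "$@"
fi
```

A matching resume script would send CONT to the same group. Whether this helps in a given case depends on which signal the job was trapping in the first place.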

Joseph

On 12/19/2014 04:54 AM, [email protected] wrote:
On December 19, 2014 6:19:58 AM EST, Reuti <[email protected]> wrote:
=> On 18.12.2014 at 22:21, [email protected] wrote:
=> >
=> > We've got a job that was suspended via:
=> >
=> >     qmod -sj $jobid
=> >
=> > that's continuing to run. The job consists of a BASH script, which in
=> > turn submits other jobs in a loop, sleeping for 30 seconds after each loop.
=> >
=> > When I examine the job status on the node where it is executing via:
=> >
=> >     ps -e f | grep $JOBID
=> >
=> > I see that the process is sleeping (state "S"), which is not unexpected,
=> > given the 'sleep 30' in the loop, but not suspended (state "T"):
=> >
=> > 30559 ? SNs 0:02 | \_ /bin/bash /var/tmp/gridengine/8.1.6/default/spool/node-5-2/job_scripts/2367998
=>
=> Maybe it was introduced in this edition, as in 6.2u5 it's working for me.

I can't believe I left that out... we're running SoGE 8.1.6.

=> Do you have a chance to test any other version on another machine
=> with your application in question?

Nope.

Mark

=>
=> -- Reuti
=>
=>
=> > Indeed, the job is not suspended, as it keeps performing the action
=> > inside the loop.
=> >
=> > The problem can be consistently reproduced with a trivial job, such as:
=> >
=> > ------------------------
=> > #! /bin/bash
=> > i=0
=> > while [ $i -le 100 ]
=> > do
=> >     date
=> >     i=$((i + 1))
=> >     sleep 30
=> > done
=> > ------------------------
=> >
=> > Submitting that job to SGE, then executing 'qmod -sj $jobid' after it
=> > starts does not suspend the running job. The 'qstat' command does show
=> > the job as being in the 's' (suspended) state.
=> >
=> > We're not using any custom 'suspend_method' or changing the default
=> > signals sent by SGE.
=> >
=> > Jobs that are suspended (due to subordinated queues) by SGE have never
=> > shown this behavior.
=> >
=> > Any suggestions about how to proceed with troubleshooting?
=> >
=> > Thanks,
=> >
=> > Mark
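
On the troubleshooting question: the check described in the quoted message (looking for state "T" rather than "S" in the ps output) can be scripted so that every process belonging to the job is inspected, not only the job script itself. The following is a sketch under the assumption that all of the job's processes share one process group, and that the group ID is known (for instance the PID of the job script shown by ps). The names is_stopped_state and report_unsuspended are invented for this example.

```bash
#!/bin/bash
# Hypothetical helpers for checking whether a job is really stopped.

# Succeed when a ps STAT string starts with "T" (stopped by a signal).
is_stopped_state() {
    case "$1" in
        T*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Print every process in process group $1 that is not stopped.
# Returns 0 when all of them are stopped, 1 otherwise.
report_unsuspended() {
    ps -e -o pid= -o pgid= -o stat= | (
        rc=0
        while read -r pid pg stat; do
            # Skip processes that belong to other groups.
            [ "$pg" = "$1" ] || continue
            if ! is_stopped_state "$stat"; then
                echo "pid $pid is in state $stat, not stopped"
                rc=1
            fi
        done
        exit "$rc"
    )
}
```

For the process in the quoted ps output, `report_unsuspended 30559` would be the call to try, assuming 30559 is also the process group ID. A state such as "SNs" is reported as not stopped.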
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
