On 08.08.2014 at 03:24, Joseph Farran wrote:

> Thanks Reuti.
>
> I'll give that a try. Do I need to set up an un-suspend method / script as
> well?

No. The default behavior should still be fine.

-- Reuti

> Joseph
>
>
> On 8/7/2014 2:33 PM, Reuti wrote:
>> Hi,
>>
>> On 07.08.2014 at 21:14, Joseph Farran wrote:
>>
>>> I am using Son of Grid Engine 8.1.6.
>>>
>>> We have an issue that occurs once in a while in which Grid Engine will
>>> suspend a job ( subordinate queue ) and while Grid Engine thinks the job is
>>> suspended ( qstat shows "S" for job state ), the process on the node keeps
>>> running and is not really suspended.
>>>
>>> If I manually suspend the job ( qmod -sj <job-id> ), then the process
>>> suspends just fine on the node and I see the "Ss" in qstat listing.
>>>
>>> Is there a way to tell Grid Engine to re-issue a suspend signal to
>>> processes on a node that are supposed to be suspended?
>>>
>>> I can manually tell GE to suspend a job ( qmod -sj ) but then I have to
>>> also manually un-suspend it. So what I am looking for is to have GE
>>> re-issue suspend signals for jobs it believes are already suspended.
>>
>> To investigate this: what about setting up a custom "suspend_method" that
>> logs whether it is called at all, and sends the SIGSTOP to the complete
>> process group on your own to mimic the original behavior:
>>
>> $ qconf -sq baz
>> ...
>> suspend_method /foo/bar/mysuspend.sh $job_pid
>>
>> And the script:
>>
>> #!/bin/sh
>> echo "suspend script called at: $(date)" >> /tmp/suspend.log
>> kill -stop -- -$1
>>
>>
>> -- Reuti
>>
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
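For the investigation it may help if the suspend script also records which pid it was given and whether the signal could be delivered at all. The following is only a sketch along the lines of the script quoted above, not something tested against Son of Grid Engine 8.1.6: the function name suspend_group and the extra log messages are my own assumptions, and it uses the `kill -s STOP` spelling, which should behave the same as `kill -stop` but is accepted by more /bin/sh implementations.

```shell
#!/bin/sh
# Sketch of a more talkative suspend script. The function name
# suspend_group and the log messages are illustrative assumptions;
# the log path is the one from the script quoted above.
LOG=/tmp/suspend.log

suspend_group() {
    pgid="$1"
    echo "suspend script called at: $(date) for pid $pgid" >> "$LOG"
    # A negative pid addresses the whole process group.
    if kill -s STOP -- "-$pgid" 2>> "$LOG"; then
        echo "SIGSTOP sent to process group $pgid" >> "$LOG"
    else
        echo "SIGSTOP to process group $pgid FAILED" >> "$LOG"
        return 1
    fi
}

# Grid Engine would pass $job_pid as the first argument.
if [ -n "${1:-}" ]; then
    suspend_group "$1"
fi
```

If a job shows "S" in qstat but there is no matching entry in the log, the method was never called. If there is a FAILED line, the signal was attempted but could not be delivered to that process group.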
