On 08.08.2014 at 03:24, Joseph Farran wrote:

> Thanks Reuti.
> 
> I'll give that a try. Do I need to set up an un-suspend method / script as 
> well?

No. The default behavior should still be fine.
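
For reference, a slightly more defensive variant of the suspend script quoted further down might look like the sketch below. It only covers the suspend side; resuming is left to the default behavior (SIGCONT). The function names `log` and `suspend_group` and the `SUSPEND_LOG` variable are made up for this illustration and are not part of Grid Engine; the only thing assumed from Grid Engine is that `$job_pid` is passed as the first argument, as in the queue configuration below.

```sh
#!/bin/sh
# Sketch of a suspend_method script; helper names are hypothetical.
# Log file location: SUSPEND_LOG is an assumed override, not a Grid Engine variable.
LOG="${SUSPEND_LOG:-/tmp/suspend.log}"

# Append a timestamped message to the log file.
log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') $*" >> "$LOG"
}

# Send SIGSTOP to the process group led by the given PID.
# Returns 2 for an unusable PID, 1 if kill fails, 0 on success.
suspend_group() {
    pid="$1"
    case "$pid" in
        ''|*[!0-9]*)
            log "refusing to suspend: invalid job pid '$pid'"
            return 2
            ;;
    esac
    # Never signal process group 0 (our own group) or 1 (init).
    if [ "$pid" -le 1 ]; then
        log "refusing to suspend: invalid job pid '$pid'"
        return 2
    fi
    log "suspend requested for process group $pid"
    # A negative PID addresses the whole process group.
    if kill -s STOP -- "-$pid" 2>/dev/null; then
        log "SIGSTOP sent to process group $pid"
    else
        log "SIGSTOP to process group $pid failed"
        return 1
    fi
}

# Grid Engine would call the script with $job_pid as the first argument.
if [ "$#" -gt 0 ]; then
    suspend_group "$1"
fi
```

If the log never shows a "suspend requested" line for a job that qstat reports as "S", the method was not called at all; a "failed" line would instead point at the signal delivery on the node.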

-- Reuti


> Joseph
> 
> 
> On 8/7/2014 2:33 PM, Reuti wrote:
>> Hi,
>> 
>> On 07.08.2014 at 21:14, Joseph Farran wrote:
>> 
>>> I am using Son of Grid Engine 8.1.6.
>>> 
>>> We have an issue that occurs once in a while in which Grid Engine will 
>>> suspend a job ( subordinate queue ) and, while Grid Engine thinks the job is 
>>> suspended ( qstat shows "S" for job state ), the process on the node keeps 
>>> running and is not really suspended.
>>> 
>>> If I manually suspend the job ( qmod -sj <job-id> ), then the process 
>>> suspends just fine on the node and I see the "Ss" in qstat listing.
>>> 
>>> Is there a way to tell Grid Engine to re-issue a suspend signal to 
>>> processes on a node that are supposed to be suspended?
>>> 
>>> I can manually tell GE to suspend a job ( qmod -sj ) but then I have to 
>>> also manually un-suspend it. So what I am looking for is to have GE 
>>> re-issue suspend signals for jobs it believes are already suspended.
>> To investigate this: what about setting up a custom "suspend_method" that 
>> logs whether it's called at all and sends SIGSTOP to the complete process 
>> group on its own, to mimic the original behavior:
>> 
>> $ qconf -sq baz
>> ...
>> suspend_method /foo/bar/mysuspend.sh $job_pid
>> 
>> And the script:
>> 
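>> #!/bin/sh
>> # Log each invocation so we can tell whether Grid Engine called us at all.
>> echo "suspend script called at: $(date)" >> /tmp/suspend.log
>> # A negative PID targets the whole process group of the job.
>> kill -s STOP -- "-$1"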
>> 
>> 
>> -- Reuti
>> 
> 


_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users