On 13.06.2012 at 02:10, Joseph Farran wrote:

> Well, for our needs, we *REALLY* need Parallel Job suspension. It's not 
> even a choice for us.
> 
> If Torque/Maui can do it, I am sure OGE can do it without issues.
> 
> Can someone please tell me what patch I need to install to un-break / turn-on 
> Parallel job suspension?
> 
> If you guys are that paranoid about PE suspension, how about adding an on/off 
> flag for this since the code is already there and let the admin pick.

Yep, but there is also the case where a slave task gets suspended and the 
suspension has to be propagated back to the master process of the parallel 
job. Therefore I had the idea to set:

qmaster_params SUSPEND_PARALLEL_GROUP=yes

in the RFE. But now I wonder whether it would be better to put it in the PE 
definition, as some parallel libraries might not like it. This is related to 
another RFE about whether a job is eligible for suspension or not.

https://arc.liv.ac.uk/trac/SGE/ticket/735

-- Reuti


> Joseph
> 
> 
> On 06/12/2012 06:52 AM, Dave Love wrote:
>> "Joseph A. Farran"<[email protected]>  writes:
>> 
>>> If you guys are taking requests, *please* add suspension and ignore old Sun 
>>> recommendation.
>> Support for suspension exists, it's just broken (per the issue Reuti
>> pointed to).  The use of | is clearly wrong, but the other bit isn't
>> clear.  It's one of the available patches I wanted to understand before
>> applying (and had forgotten about).  Can anyone cast more light on it?
>> 
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users

