Hello Reuti,

I'm picking up our slotwise preemption problem where my colleague Dan
left off (thanks for your input on that a few months back, by the way)
and I'm trying to see if we can use a grid engine version that does
properly implement it.  We installed SGE 6.2u5p3 and set up a few very
simple queues for testing, but when I run my tests, it still seems to
assume "one job, one slot" -- jobs using a PE and specifying multiple
slots don't get counted properly, just like we saw before.

So at this point I'm wondering, do any versions (recent or otherwise)
actually support what we're attempting?  Has anyone here gotten it
working?  I suppose it's possible I need an even older version, or a very
particular one -- that Univa document did mention 6.2u6 and u7
specifically -- but I realize I could also just be chasing something that
won't work at all :)  There's still the option of the workaround you
described; I'm just avoiding setting it up if possible, since it'd add a
fair bit of complexity to the configuration.

Thanks (again) for any thoughts!

Jesse Connell
Manager of Research Computing
College of Engineering, Boston University



>On 30.01.2014 at 15:11, Daniel Kamalic wrote:
>
>> Thanks for your quick response, Reuti. Uggh, that's too bad. I'm
>>running version 2011.11. Do you think this was fixed in a newer
>>version? (I think based on your last sentence that you're saying you
>>think it was fixed.)
>
>Maybe it was in OGE only:
>
>http://www.univa.com/resources/files/Release_Notes_Univa_Grid_Engine_8.0.0.pdf
>
>(Second-to-last sentence)
>
>
>> If not, do you have any suggested workarounds?
>
>Not really one that I'm happy with.
>
>- In the prolog of the superordinated queue, check how many slots (n) were
>requested; we need (n-1) additional suspensions.
>- In the prolog, submit (n-1) super-superordinated jobs to a dummy queue
>with zero resource requests to trigger more job suspensions
>(maybe put the job id of the original job in the job name to select them
>easily later).
>- In the epilog, `qdel` the dummy jobs.
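
A minimal sketch of the bookkeeping in the list above, not a tested setup.
It only builds the command lines and never runs them. The queue name
dummy.q, the preempt_<jobid> naming scheme, the function names, and the
sleep payload are all assumptions for illustration; in a real prolog the
job id and slot count would presumably come from the job's environment
(e.g. $JOB_ID and $NSLOTS).

```python
import shlex


def dummy_job_commands(job_id, nslots, dummy_queue="dummy.q", payload=("sleep", "86400")):
    """Build the qsub commands a prolog could issue for one parallel job.

    The job itself already accounts for one suspension, so only
    (nslots - 1) placeholder jobs are needed.
    """
    extra = max(nslots - 1, 0)
    name = f"preempt_{job_id}"  # hypothetical naming scheme: embeds the original job id
    return [
        ["qsub", "-q", dummy_queue, "-N", name, "-b", "y", *payload]
        for _ in range(extra)
    ]


def cleanup_command(job_id):
    """Build the qdel command an epilog could issue to remove the placeholders by name."""
    return ["qdel", f"preempt_{job_id}"]


def render(commands):
    """Render argv lists as shell-quoted lines, e.g. for logging or a dry run."""
    return [shlex.join(cmd) for cmd in commands]
```

For a 4-slot job with id 42, render(dummy_job_commands(42, 4)) yields three
identical qsub lines and cleanup_command(42) yields the matching qdel. Whether
the dummy jobs actually trigger the extra suspensions depends on the
subordinate_list setting of the dummy queue.
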
>
>The super-superordinated queue will need a setting like:
>
>subordinate_list slots=4(low.q:1:sr, high.q:2:sr)
>
>We need the superordinated high.q here to limit the overall count of
>active slots (replace 4 with your configured value). As we suspend
>(n-1) slots in addition, a job in high.q should never actually get
>suspended. To set this up, it's necessary to upgrade to SoGE, as
>otherwise one unsuspended slot is left over:
>
>https://arc.liv.ac.uk/trac/SGE/ticket/775
>
>-- Reuti
>
>
>> On 1/30/14, 6:04 AM, Reuti wrote:
>>> Hi, 
>>>
>>> On 29.01.2014 at 20:01, Daniel Kamalic wrote:
>>>
>>>> Slotwise preemption doesn't seem to be working correctly for single
>>>>jobs that take up multiple slots on my setup:
>>>
>>> Unfortunately that's true.
>>>
>>> I can't find a discussion about it in the mailing list though. I
>>>thought this was an issue which was fixed in the meantime.
>>>
>>> -- Reuti



