Re: [gridengine users] bsub -w "started(aJob)"

Reuti Wed, 29 Jun 2016 08:47:42 -0700

Hi,

On 29.06.2016 at 11:28, Ueki Hikonuki wrote:


> Hi Reuti,
> 
> I tried your example, but since I'm not familiar with AR, I couldn't achieve 
> what I wanted.
> 
> Finally I came up with the following Ruby script, which waits until a 
> specified job enters the RUNNING state. I call it in job-2 after 
> submitting job-1.

While this will work, it will block a slot with an idling job until job 1 
starts, right?
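
One way to avoid the idle slot is to do the waiting on the submit host instead of inside job 2. The sketch below is only an illustration under assumptions: it parses the default `qstat` listing (columns as in the sample output quoted further down in this thread), treats a state of exactly `r` as running, and takes the `qstat` call as a callable supplied by the caller. The names `job_state` and `wait_until_running` are hypothetical helpers, not part of SGE or DRMAA.

```ruby
# Hypothetical helper: return the state column (e.g. "qw", "r") for one job
# from the text of a default `qstat` listing, or nil if the job is not listed.
def job_state(qstat_output, job_id)
  qstat_output.each_line do |line|
    fields = line.split
    # Data rows start with the numeric job ID; the state is the fifth column.
    return fields[4] if fields[0] == job_id.to_s && fields.size >= 5
  end
  nil
end

# Hypothetical helper: poll until the job is running.
# `probe` is any callable returning the current qstat text, so the loop can
# be exercised without a cluster. Returns true once the job is running and
# false if it is no longer listed or the polls are used up.
def wait_until_running(job_id, probe, interval: 60, max_polls: 1440)
  max_polls.times do
    state = job_state(probe.call, job_id)
    return false if state.nil?   # finished, deleted or unknown job
    return true if state == "r"
    sleep interval
  end
  false
end

# Possible use on a submit host (assumed file names, not run here):
#   probe = -> { `qstat -u #{ENV["USER"]}` }
#   system("qsub", "job2.sh") if wait_until_running(job1_id, probe)
```

Job 2 is then submitted only after job 1 has been seen running, so it never occupies a slot while waiting. As with the hold-based approach, submission does not guarantee that job 2 is scheduled immediately.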

-- Reuti


> Thank you
> 
> -----------------------------------------------------------------
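> #!/usr/bin/env ruby
> # Usage: pass the ID of the job to wait for as the first argument.
> unless ENV["SGE_ROOT"]
>  STDERR.puts "SGE_ROOT is not set"
>  exit 1   # without SGE_ROOT the load path below cannot be built
> end
> # Make the drmaa4ruby binding under $SGE_ROOT loadable.
> $LOAD_PATH.unshift(ENV["SGE_ROOT"] + "/util/resources/drmaa4ruby")
> 
> require 'drmaa.rb'
> 
> jid = ARGV[0]
> s = DRMAA::Session.new
> 
> # Poll once a minute until the job reports the RUNNING state.
> while true
>  ret = DRMAA::job_ps(jid)
>  break if ret == DRMAA::STATE_RUNNING
>  sleep 60
> end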
> -----------------------------------------------------------------
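
The loop above never returns if job-1 fails or finishes before it is observed in the RUNNING state. A minimal sketch of a bounded variant follows. To keep it runnable without a cluster, it takes the state lookup as a callable and uses whatever state values the caller passes in; mapping these to the constants of drmaa4ruby (for example `DRMAA::STATE_RUNNING`, as used above) is left to the caller, and the helper name `wait_for_start` is hypothetical.

```ruby
# Hypothetical helper: poll `lookup` (a callable returning the job's current
# state) until the job is running, has ended, or the timeout has passed.
# Returns :running, :ended or :timeout.
def wait_for_start(lookup, running:, ended:, timeout: 3600, interval: 60)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    state = lookup.call
    return :running if state == running
    return :ended if ended.include?(state)
    return :timeout if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end
```

The caller can then decide what job-2 should do in each case, for example exit with an error when the result is `:ended` or `:timeout` instead of idling indefinitely.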
> 
> 
> On 2016/06/21 7:56, Reuti wrote:
>> Hi,
>> 
>> On 16.06.2016 at 08:27, Ueki Hikonuki wrote:
>> 
>>> Hi Daniel,
>>> 
>>> Sorry for my delayed response.
>>> 
>>> OK, I understand that there is no direct support.
>> 
>> What you can do to get this with SGE: submit an advance reservation and then 
>> submit the jobs in question into this reservation:
>> 
>> $ qrsub -pe smp 2 -d 3600 -a 201606210045
>> Your advance reservation 14 has been granted
>> $ qsub -pe smp 1 -ar 14 test.sh
>> Your job 190916 ("test.sh") has been submitted
>> $ qsub -pe smp 1 -ar 14 test.sh
>> Your job 190917 ("test.sh") has been submitted
>> $ qstat -u reuti
>> job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
>> -----------------------------------------------------------------------------------------------------------------
>> 190916 1.55000 test.sh    reuti        r     06/21/2016 00:45:50 parallel@node24                    1
>> 190917 1.05000 test.sh    reuti        r     06/21/2016 00:45:50 parallel@node24                    1
>> 
>> Since the AR requested a PE, each job must request a PE as well, but it 
>> can be a different PE or (as in your case) a different number of slots. 
>> As the reserved slots are free, the jobs will start at almost the same time.
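
The reservation workflow above can be scripted. The following sketch only builds the command lines and extracts the reservation ID from the `qrsub` message shown above; it assumes that message format is stable, and the helper names are hypothetical. Actually running the commands is left to the caller.

```ruby
# Hypothetical helper: command line for requesting an advance reservation.
# `start` is a Time; it is formatted as YYYYMMDDhhmm, as in the example above.
def qrsub_command(pe:, slots:, duration_s:, start:)
  ["qrsub", "-pe", pe, slots.to_s, "-d", duration_s.to_s,
   "-a", start.strftime("%Y%m%d%H%M")]
end

# Extract the AR ID from a message such as
# "Your advance reservation 14 has been granted"; nil if it does not match.
def ar_id(qrsub_output)
  m = qrsub_output.match(/advance reservation (\d+) has been granted/)
  m && m[1]
end

# Command line for submitting one job into the reservation.
def qsub_into_ar(ar, script, pe:, slots: 1)
  ["qsub", "-pe", pe, slots.to_s, "-ar", ar.to_s, script]
end
```

Each array can be passed to `system(*cmd)` or `Open3.capture2(*cmd)`; building the commands as arrays avoids shell quoting issues.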
>> 
>> -- Reuti
>> 
>> 
>>> 
>>> Since my job-1 and job-2 don't have to start at exactly the same time
>>> (a difference of a few minutes is acceptable), I wanted an LSF-like option.
>>> 
>>> I will look into DRMAA.
>>> 
>>> Thank you,
>>> 
>>> Ueki
>>> 
>>> On 2016/06/14 15:28, Daniel Gruber wrote:
>>>> No direct support for that in SGE.
>>>> 
>>>> Releasing a job from hold (e.g. when another job starts) does not
>>>> mean that it is executed. Hence you would not have any guarantee that
>>>> both are running at the same point in time.
>>>> 
>>>> You could submit the successor before the other job and give that job
>>>> the job ID of the successor as context (-ac). Then, in the job or in a
>>>> prolog, you could trigger removing the hold from the successor when a
>>>> certain criterion is met.
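
A rough sketch of this hold-and-release idea, expressed as the command lines involved. It assumes the standard `qsub` options `-h` (submit with a user hold), `-terse` (print only the job ID) and `-ac` (attach a context variable), plus `qrls` to release the hold. The context key `successor` and the helper names are made up for illustration, and how the first job reads its context back and decides that the criterion is met is not shown.

```ruby
# Step 1: submit the successor on hold; with -terse, qsub prints just the job ID.
def submit_successor_on_hold(script)
  ["qsub", "-terse", "-h", script]
end

# Step 2: submit the first job and record the successor's ID in its context.
def submit_first_job(script, successor_id)
  ["qsub", "-ac", "successor=#{successor_id}", script]
end

# Step 3 (inside the first job or its prolog): release the successor's hold.
def release_successor(successor_id)
  ["qrls", successor_id.to_s]
end
```

As noted above, releasing the hold only makes the successor eligible for scheduling; it does not guarantee that it starts right away.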
>>>> 
>>>> In UGE there is a qsub -sync r … which blocks until the job is in running
>>>> state - then you could do the other job submission within a job submission
>>>> script.
>>>> 
>>>> qsub -sync r myjob.sh && qsub myotherjob.sh
>>>> 
>>>> If you really want to execute multiple jobs at the same time, then a newer
>>>> enhancement (sorry, again UGE) would be
>>>> 
>>>> qsub -tcon yes -t 1-2 myjob.sh
>>>> 
>>>> That would execute myjob.sh as a job array with two instances (-t 1-2)
>>>> and guarantee that both are scheduled at the same time (-tcon yes).
>>>> You can then use the SGE_TASK_ID environment variable to determine
>>>> which task is running and which code to execute.
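
Inside such an array job, the branching on `SGE_TASK_ID` might look like the following sketch. The role names are invented for illustration, and the environment is passed in as a hash so the function can be tried outside a job.

```ruby
# Hypothetical mapping from array task number to the work that task does.
ROLES = { 1 => :server, 2 => :client }.freeze

# Return the role for the current array task based on SGE_TASK_ID.
# Raises ArgumentError if the variable is missing or is not a known task number.
def role_for(env = ENV)
  raw = env["SGE_TASK_ID"]
  id = raw.to_s.match?(/\A\d+\z/) ? raw.to_i : nil
  ROLES.fetch(id) { raise ArgumentError, "unexpected SGE_TASK_ID: #{raw.inspect}" }
end
```

A job script submitted with `-t 1-2` could then call `role_for` once at startup and dispatch to the matching code path.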
>>>> 
>>>> Of course, you get the most flexibility in job submission and job
>>>> workflow control by using an external API such as the DRMAA API.
>>>> 
>>>> Daniel
>>>> 
>>>> 
>>>>> On 14.06.2016 at 05:56, Hikonuki Ueki <[email protected]> wrote:
>>>>> 
>>>>> I believe somebody has already asked this, but I couldn't find it in the 
>>>>> mailing list archive.
>>>>> 
>>>>> I am trying to figure out the qsub equivalent of the LSF option bsub -w 
>>>>> "started(aJob)".
>>>>> 
>>>>> I easily found that bsub -w "done(aJob)" could be achieved by qsub 
>>>>> -hold_jid aJob.
>>>>> 
>>>>> I guess it is very natural to want to execute a job as soon as a 
>>>>> certain other job starts, so that the two can work together.
>>>>> 
>>>>> What am I missing?
>>>>> 
>>>>> Thank you,
>>>>> 
>>>>> Ueki
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> [email protected]
>>>>> https://gridengine.org/mailman/listinfo/users
>>>> 
>>> 
>> 


