Hi,
On 09.12.2016 at 08:20, John_Tai wrote:
> I've setup PE but I'm having problems submitting jobs.
>
> - Here's the PE I created:
>
> # qconf -sp cores
> pe_name cores
> slots 999
> user_lists NONE
> xuser_lists NONE
> start_proc_args /bin/true
> stop_proc_args /bin/true
> allocation_rule $pe_slots
> control_slaves FALSE
> job_is_first_task TRUE
> urgency_slots min
> accounting_summary FALSE
> qsort_args NONE
>
> - I've then added this to all.q:
>
> qconf -aattr queue pe_list cores all.q
How many "slots" were defined in the queue definition for all.q?
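
As a quick check (assuming you can run qconf against the qmaster), the queue's slot count and attached PEs can be shown with:

```
qconf -sq all.q | grep -E '^(slots|pe_list)'
```

If "slots" is 0 or the PE is missing from "pe_list" for that host, the PE will offer 0 slots exactly as qalter reports.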
-- Reuti
> - Now I submit a job:
>
> # qsub -V -b y -cwd -now n -pe cores 2 -q all.q@ibm038 xclock
> Your job 89 ("xclock") has been submitted
> # qstat
> job-ID  prior    name    user   state  submit/start at      queue  slots  ja-task-ID
> -----------------------------------------------------------------------------------------------------------------
> 89      0.00000  xclock  johnt  qw     12/09/2016 15:14:25         2
> # qalter -w p 89
> Job 89 cannot run in PE "cores" because it only offers 0 slots
> verification: no suitable queues
> # qstat -f
> queuename                      qtype resv/used/tot. load_avg arch          states
> ---------------------------------------------------------------------------------
> all.q@ibm038                   BIP   0/0/8          0.00     lx-amd64
>
> ############################################################################
> - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
> ############################################################################
> 89 0.55500 xclock johnt qw 12/09/2016 15:14:25 2
>
>
> ----------------------------------------------------
>
> It looks like all.q@ibm038 should have 8 free slots, so why is it only
> offering 0?
>
> Hope you can help me.
> Thanks
> John
>
> -----Original Message-----
> From: Reuti [mailto:[email protected]]
> Sent: Monday, December 05, 2016 6:32
> To: John_Tai
> Cc: [email protected]
> Subject: Re: [gridengine users] CPU complex
>
> Hi,
>
>> On 05.12.2016 at 09:36, John_Tai <[email protected]> wrote:
>>
>> Thank you so much for your reply!
>>
>>>> Will you use the consumable virtual_free here instead of mem?
>>
>> Yes I meant to write virtual_free, not mem. Apologies.
>>
>>>> For parallel jobs you need to configure one (or more) so-called PEs
>>>> (Parallel Environments).
>>
>> My jobs are actually just one process which uses multiple cores, so for
>> example in top one process "simv" is currently using 2 cpu cores (200%).
>
> Yes, then it's a parallel job as far as SGE is concerned. A PE is SGE's
> paradigm for a parallel job, even though the entries for start_proc_args
> and stop_proc_args can be left at their defaults.
>
>
>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>> 3017 kelly 20 0 3353m 3.0g 165m R 200.0 0.6 15645:46 simv
>>
>> So I'm not sure PE is suitable for my case, since it is not multiple
>> parallel processes running at the same time. Am I correct?
>>
>> If so, I am trying to find a way to get SGE to keep track of the number of
>> cores used, but I believe it only keeps track of the total CPU usage in %. I
>> guess I could use this and the <total num cores> to get the <num of
>> cores in use>, but how do I integrate it into SGE?
>
> You can specify the number of cores your job needs with the -pe
> parameter, which also accepts a range. Inside the job script you can check
> the allocation SGE granted via $NHOSTS, $NSLOTS and $PE_HOSTFILE.
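
A minimal sketch of reading these in a job script. Inside a real job SGE exports NSLOTS, NHOSTS and PE_HOSTFILE itself; here a two-host hostfile is fabricated (hostnames illustrative) so the snippet runs standalone:

```shell
#!/bin/sh
# In a real SGE job the scheduler provides PE_HOSTFILE; we fake one here.
# Each hostfile line reads: hostname slots queue processor_range
PE_HOSTFILE=$(mktemp)
cat > "$PE_HOSTFILE" <<'EOF'
ibm038 2 all.q@ibm038 UNDEFINED
ibm039 4 all.q@ibm039 UNDEFINED
EOF

# Derive the host and slot counts the same way SGE does.
NHOSTS=$(awk 'END { print NR }' "$PE_HOSTFILE")
NSLOTS=$(awk '{ s += $2 } END { print s }' "$PE_HOSTFILE")
echo "hosts=$NHOSTS slots=$NSLOTS"

rm -f "$PE_HOSTFILE"
```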
>
> With this setup, SGE will track the number of cores in use per machine;
> the available ones you define in the queue definition. In case you have more
> than one queue per exechost, you additionally need to set up an overall limit
> on the cores that can be used at the same time, to avoid oversubscription.
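
One way to express such a limit is a resource quota set (see `man sge_resource_quota`). A sketch, assuming 8 cores per host; the rule name is made up, and you would add it with qconf -arqs:

```
{
   name         max_slots_per_host
   description  "Cap total slots per exechost across all queues"
   enabled      TRUE
   limit        hosts {*} to slots=8
}
```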
>
> -- Reuti
>
>> Thank you again for your help.
>>
>> John
>>
>> -----Original Message-----
>> From: Reuti [mailto:[email protected]]
>> Sent: Monday, December 05, 2016 4:21
>> To: John_Tai
>> Cc: [email protected]
>> Subject: Re: [gridengine users] CPU complex
>>
>> Hi,
>>
>> On 05.12.2016 at 08:00, John_Tai wrote:
>>
>>> Newbie here, hope to understand SGE usage.
>>>
>>> I've successfully configured virtual_free as a complex for telling SGE how
>>> much memory is needed when submitting a job, as described here:
>>>
>>> https://docs.oracle.com/cd/E19957-01/820-0698/6ncdvjclk/index.html#i1000029
>>>
>>> How do I do the same for telling SGE how many CPU cores a job needs? For
>>> example:
>>>
>>> qsub -l mem=24G,cpu=4 myjob
>>
>> Will you use the consumable virtual_free here instead of mem?
>>
>>
>>> Obviously I'd need for SGE to keep track of the actual CPU utilization in
>>> the host, just as virtual_free is being tracked independently of the SGE
>>> jobs.
>>
>> For parallel jobs you need to configure one (or more) so-called PEs
>> (Parallel Environments). Their purpose is to make preparations for parallel
>> jobs, like rearranging the list of granted slots, preparing shared
>> directories between the nodes, ...
>>
>> These PEs were more important in former times, when parallel libraries
>> were not programmed to integrate automatically with SGE for a tight
>> integration. Your submissions could read:
>>
>> qsub -pe smp 4 myjob    # allocation_rule $pe_slots, control_slaves TRUE
>> qsub -pe orte 16 myjob  # allocation_rule $round_robin, control_slaves TRUE
>>
>> where smp and orte are the chosen parallel environments for OpenMP and
>> Open MPI respectively. Their settings are explained in `man sge_pe`, and the
>> "-pe" parameter of the submission command in `man qsub`.
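
For reference, a minimal "smp" PE for single-node multicore jobs might look like this (values are illustrative; create it with qconf -ap smp and attach it to a queue's pe_list):

```
pe_name            smp
slots              999
allocation_rule    $pe_slots
control_slaves     TRUE
job_is_first_task  TRUE
```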
>>
>> -- Reuti
>> ________________________________
>>
>> This email (including its attachments, if any) may be confidential and
>> proprietary information of SMIC, and intended only for the use of the named
>> recipient(s) above. Any unauthorized use or disclosure of this email is
>> strictly prohibited. If you are not the intended recipient(s), please notify
>> the sender immediately and delete this email from your computer.
>>
>
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users