On 08.08.2012 at 11:27, Lionel SPINELLI wrote:

> Hello all,
> 
> I have a question about how the qsub command can be used to tell the 
> grid that the submitted script will use a specific number of CPUs.
> We are using genomic alignment tools that are written to manage 
> multi-threading themselves. The user decides how many threads the genomic 
> alignment will be launched with, and the software spawns the required number 
> of threads to execute the computation.
> So, when users run the qsub command to launch such software, they decide 
> the number of CPUs to use. Often they use 10 CPUs. Since the exec nodes have 
> 16 CPUs each, the grid should be informed that no more than one of those 
> scripts should run at a time on each node.
> Is it possible to add an option to the qsub command to tell the grid how 
> many CPUs (slots) the script will consume?
> 
> I have read plenty of documentation on parallel environments and ran some 
> tests, but it seems that this is not what I need (if I understand correctly, 
> a PE will dispatch the script on multiple CPUs, while in my case the software 
> itself manages its multi-threading).

No. A PE only grants the job permission to use a certain number of cores, and 
these are recorded in the $PE_HOSTFILE. The jobscript is started only once by 
SGE, on the master node of the parallel job, i.e. the first one in the 
$PE_HOSTFILE. It's the duty of the jobscript to take the granted cores into 
account.
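
To illustrate what the jobscript sees, here is a minimal sketch that reads the granted slots from $PE_HOSTFILE. It assumes the usual line layout (host name, slot count, queue instance, processor range). The fallback file, the host name "node01" and the variable names GRANTED and MASTER are made up so the snippet also runs outside SGE.

```shell
# Outside SGE there is no $PE_HOSTFILE, so create a sample one.
# (node01 and all.q are placeholder names, not taken from a real cluster.)
if [ -z "${PE_HOSTFILE:-}" ]; then
    PE_HOSTFILE=$(mktemp)
    printf 'node01 10 all.q@node01 UNDEFINED\n' > "$PE_HOSTFILE"
fi

# Column 2 holds the number of slots granted on each host; sum them up.
GRANTED=$(awk '{ total += $2 } END { print total + 0 }' "$PE_HOSTFILE")

# The first line names the master node, where the jobscript is started.
MASTER=$(awk 'NR == 1 { print $1 }' "$PE_HOSTFILE")

echo "master=$MASTER slots=$GRANTED"
```

With a single-host allocation the file has only one line, so the sum is simply the slot count on that host.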

So it's doing what you want AFAICS.

As you want to stay on one and the same host, the only thing which must be 
taken care of is the "allocation_rule" in the PE definition which should be set 
to "$pe_slots".
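
As a sketch of such a PE definition: the PE name "smp", the slot limit of 999 and the file name smp.pe are assumptions for illustration, and the remaining fields are common defaults that may differ between SGE versions. Only the allocation_rule line is the point here.

```shell
# Write a PE definition to a file ("smp" and 999 are placeholder values).
# The quoted 'EOF' keeps $pe_slots literal instead of expanding it.
cat > smp.pe <<'EOF'
pe_name            smp
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $pe_slots
control_slaves     FALSE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary FALSE
EOF

# On a host with SGE admin rights, the file could then be registered
# and the PE attached to a queue (all.q is an assumed queue name):
#   qconf -Ap smp.pe
#   qconf -aattr queue pe_list smp all.q
```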

The users just have to request the proper number of cores and use only as many 
threads as they requested (e.g. by setting OMP_NUM_THREADS for OpenMP); SGE can 
then use the remaining cores for other jobs.
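
A minimal jobscript along these lines might look as follows. It assumes a PE named "smp" exists (a hypothetical name) and relies on $NSLOTS, which SGE sets to the number of granted slots. The aligner command is a placeholder and is left commented out.

```shell
#!/bin/sh
#$ -pe smp 10
#$ -cwd

# NSLOTS is set by SGE inside the job; fall back to 1 when run by hand.
THREADS="${NSLOTS:-1}"

# OpenMP programs pick up their thread count from this variable.
export OMP_NUM_THREADS="$THREADS"

echo "running with $THREADS threads"

# Placeholder for the real tool; pass the same number to its thread option:
#   aligner --threads "$THREADS" reads.fq
```

The request can also be given on the command line instead of in the script, e.g. `qsub -pe smp 10 job.sh`, where job.sh is an assumed file name.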

--

Nevertheless, some prefer to always request a complete node: an exclusive 
attribute is attached to each node, and the users request it and submit only a 
serial job. As they get a complete node this way, they can do whatever they 
want with the cores on this machine.
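
A rough sketch of that setup, assuming the commonly used complex named "exclusive" with the EXCL relation. The host name node01, the script name job.sh and the urgency value are placeholders, and the SGE commands are shown as comments because they need admin rights on a real cluster.

```shell
# Line to add to the complex list via `qconf -mc`. Columns are:
# name shortcut type relop requestable consumable default urgency
COMPLEX_LINE='exclusive excl BOOL EXCL YES YES 0 1000'
echo "$COMPLEX_LINE"

# Attach the attribute to each exec host (node01 is a placeholder):
#   qconf -mattr exechost complex_values exclusive=true node01
#
# Users then submit a serial job that requests the whole node:
#   qsub -l exclusive=true job.sh
```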

--

Personally I prefer using a PE, as it's still a parallel job and should be 
handled this way.

-- Reuti


> Can anybody tell me if there is a solution for this situation?
> 
> Thanks a lot in advance
> 
> regards
> 
> Lionel
> _______________________________________________
> users mailing list
> users@gridengine.org
> https://gridengine.org/mailman/listinfo/users

