Got it. I just tried it out, and I think that might just do it.

Thanks for your help.
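
For the archive, here is a minimal sketch of how a job could work out which IB domain it landed in once the PEs are split per domain. It assumes the PE names hpmpi-ib1 and hpmpi-ib2 from the discussion below and relies on the PE and NSLOTS variables that Grid Engine exports to parallel jobs. The helper name ib_domain_from_pe is made up for illustration, and the slot count of 16 is only an example.

```shell
#!/bin/sh
# Hypothetical helper: map the granted PE name (e.g. "hpmpi-ib2") to an
# InfiniBand domain label by stripping the assumed "hpmpi-" prefix.
ib_domain_from_pe() {
    case "$1" in
        hpmpi-ib*) printf '%s\n' "${1#hpmpi-}" ;;
        *)         printf '%s\n' "none" ;;
    esac
}

# Inside a job submitted with something like:
#   qsub -q blah -pe "hpmpi-ib*" 16 job.sh
# Grid Engine sets PE to the name of the PE that was actually granted
# and NSLOTS to the number of slots.
IB_DOMAIN=$(ib_domain_from_pe "${PE:-}")
echo "PE=${PE:-unset} NSLOTS=${NSLOTS:-unset} ibdomain=${IB_DOMAIN}"
```

Run outside Grid Engine, PE is unset and the script reports "none", so it is safe to test by hand before submitting.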


> Subject: Re: [gridengine users] Creating an infiniband complex
> From: [email protected]
> Date: Wed, 18 Feb 2015 15:52:19 +0100
> CC: [email protected]
> To: [email protected]
> 
> On 18.02.2015 at 15:35, Kevin Taylor <[email protected]> wrote:
> > 
> > 
> > I'm not sure if I'm getting the concept or not, but see if this is what you 
> > mean.
> > 
> > I have several parallel environments already to handle different software 
> > application needs. One of the CFD environments uses hpmpi for its 
> > processing (over IB), so we have an hpmpi PE set up for it.
> > 
> > If I were to split that PE into something like hpmpi-ib1 and hpmpi-ib2, and 
> > set up certain machines to use either hpmpi-ib1 or hpmpi-ib2, then when I 
> > submit my job:
> > 
> > qsub -q blah -pe "hpmpi-ib*" job.sh  
> > 
> > it'll just pick one?
> > 
> > Now that I re-read what I wrote, it's pretty much the same thing you and 
> > William are saying. Is there something else I forgot here?
> 
> Besides the missing slot count in the above statement: no
> 
> -- Reuti
> 
> 
> > > Subject: Re: [gridengine users] Creating an infiniband complex
> > > From: [email protected]
> > > Date: Wed, 18 Feb 2015 15:06:17 +0100
> > > CC: [email protected]
> > > To: [email protected]
> > > 
> > > On 18.02.2015 at 13:13, Kevin Taylor <[email protected]> wrote:
> > > > 
> > > > 
> > > > I have several groups of machines that have infiniband on them and due 
> > > > to history and physical locations, these groups of machines have 
> > > > individual infiniband domains. 
> > > > 
> > > > What I've done right now (not in production) is create a boolean 
> > > > complex for 'ib' and identify all of the nodes that contain infiniband. 
> > > > I've also created a string complex called 'ibdomain' that has a name to 
> > > > uniquely identify which systems connect to each other with IB. 
> > > > 
> > > > Is there a way that a user could just ask for 'ib' when submitting a 
> > > > parallel job (I don't care where it goes as long as it has infiniband), 
> > > > and have the grid engine tell the job the value of 'ibdomain'? Or keep 
> > > > the job within systems on the same ibdomain?
> > > 
> > > It should work to request one of the domains by specifying its name as a 
> > > request with a specific string `qsub -l ibdomain=section2 ...`. But this 
> > > may not be what you are looking for as you can't use a wildcard here (at 
> > > least not with the effect to stay inside one domain).
> > > 
> > > Instead of a complex which receives a certain number, it's easier to 
> > > define one PE per domain with a suffix and then request like `qsub -pe 
> > > "ib*" 16 ...` for any of the IB domains. Once a PE is selected, only 
> > > slots belonging to this PE will be selected for this job.
> > > 
> > > You can attach the PEs in a single queue by defining a list of PEs for 
> > > individual nodes or per hostgroup (which might shorten the line).
> > > 
> > > $ qconf -sq all.q
> > > ...
> > > pe_list make,[@ibhosts1=ib1 make],[@ibhosts2=ib2 make smp]
> > > 
> > > (the default applies only to machines not listed later in the list; 
> > > it is not added to the other entries automatically)
> > > 
> > > -- Reuti
> > _______________________________________________
> > users mailing list
> > [email protected]
> > https://gridengine.org/mailman/listinfo/users
> 