Hi Kevin,

Using wildcards in PEs does slow down the scheduler a little when it is
making a decision on which PE to use.  Reuti's suggestion works very well
though and is used by other sites with similar IB and other network
topology limitations or configurations.  The only time that things will
really slow down is if you have many different PEs that can be chosen.
This happens rarely though.

Regards,

Bill.


On Wed, Feb 18, 2015 at 9:54 AM, Kevin Taylor <[email protected]>
wrote:

>
> Got it. I just tried it out, and I think that might just do it.
>
> Thanks for your help.
>
>
> > Subject: Re: [gridengine users] Creating an infiniband complex
> > From: [email protected]
> > Date: Wed, 18 Feb 2015 15:52:19 +0100
> > CC: [email protected]
> > To: [email protected]
> >
> > On 18.02.2015 at 15:35, Kevin Taylor <[email protected]> wrote:
> > >
> > >
> > > I'm not sure if I'm getting the concept or not, but see if this is
> > > what you mean.
> > >
> > > I have several parallel environments already to handle different
> > > software application needs. One of the CFD environments uses hpmpi for its
> > > processing (over IB), so we have an hpmpi PE set up for it.
> > >
> > > If I were to change that PE to something like hpmpi-ib1 and hpmpi-ib2, and
> > > set up certain machines to use either hpmpi-ib1 or hpmpi-ib2, then when I
> > > submit my job:
> > >
> > > qsub -q blah -pe "hpmpi-ib*" job.sh
> > >
> > > it'll just pick one?
> > >
> > > Now that I re-read what I wrote, it's pretty much the same thing you
> > > and William are saying. Is there something else I forgot here?
> >
> > Besides the missing slot count in the above statement: no
> >
> > -- Reuti
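
A minimal sketch of how the job script could find out which domain it landed in, assuming the PEs are named hpmpi-ib1 and hpmpi-ib2 as proposed above. Grid Engine exports the name of the granted PE to the job as `$PE`; the helper `pe_to_domain` is hypothetical and not part of Grid Engine, and the slot count of 16 in the comment is only a placeholder.

```shell
#!/bin/sh
# job.sh, submitted with the slot count included, for example:
#   qsub -q blah -pe "hpmpi-ib*" 16 job.sh
# (16 is a placeholder value, not taken from the thread.)

# Hypothetical helper: map a granted PE name to a domain label by
# stripping the assumed "hpmpi-" prefix.
pe_to_domain() {
    case "$1" in
        hpmpi-ib*) echo "${1#hpmpi-}" ;;
        *)         echo "unknown" ;;
    esac
}

# PE is set by Grid Engine for parallel jobs; default to empty so the
# script also runs outside the scheduler.
domain=$(pe_to_domain "${PE:-}")
echo "granted PE: ${PE:-none}, IB domain: $domain"
```

Because the scheduler only grants slots from the one PE it selected, the value of `$PE` identifies the domain for every host in the job.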
> >
> >
> > > > Subject: Re: [gridengine users] Creating an infiniband complex
> > > > From: [email protected]
> > > > Date: Wed, 18 Feb 2015 15:06:17 +0100
> > > > CC: [email protected]
> > > > To: [email protected]
> > > >
> > > > On 18.02.2015 at 13:13, Kevin Taylor <[email protected]> wrote:
> > > > >
> > > > >
> > > > > I have several groups of machines that have infiniband on them and
> > > > > due to history and physical locations, these groups of machines have
> > > > > individual infiniband domains.
> > > > >
> > > > > What I've done right now (not in production) is create a boolean
> > > > > complex for 'ib' and identify all of the nodes that contain infiniband.
> > > > > I've also created a string complex called 'ibdomain' that has a name to
> > > > > uniquely identify which systems connect to each other with IB.
> > > > >
> > > > > Is there a way that a user could just ask for 'ib' when submitting
> > > > > a parallel job (I don't care where it goes as long as it has infiniband),
> > > > > and have the grid engine tell the job the value of 'ibdomain'? Or keep the
> > > > > job within systems on the same ibdomain?
> > > >
> > > > You can request one of the domains by name with a string request such
> > > > as `qsub -l ibdomain=section2 ...`. But this may not be what you are
> > > > looking for, as you can't use a wildcard here (at least not in a way
> > > > that keeps the job inside one domain).
> > > >
> > > > Instead of a complex which receives a certain number, it's easier to
> > > > define one PE per domain, each with its own suffix, and then request
> > > > `qsub -pe "ib*" 16 ...` to get any of the IB domains. Once a PE is
> > > > selected, only slots belonging to that PE are used for the job.
> > > >
> > > > You can attach the PEs to a single queue by defining a list of PEs
> > > > for individual nodes or per hostgroup (which might shorten the line).
> > > >
> > > > $ qconf -sq all.q
> > > > ...
> > > > pe_list make,[@ibhosts1=ib1 make],[@ibhosts2=ib2 make smp]
> > > >
> > > > (the default is used only for machines not listed further on in
> > > > the list; it is not added to all of them automatically)
> > > >
> > > > -- Reuti
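
The one-PE-per-domain setup described above could be prepared roughly as follows. This is only a sketch under stated assumptions: the PE names ib1 and ib2 match the `pe_list` example, but the slot limit, the allocation rule and the other field values are placeholders to adapt to your site, and `write_pe` is a hypothetical helper. The script writes the definition files and prints the `qconf` commands instead of running them, so it can be tried without touching a live cluster.

```shell
#!/bin/sh
# Write one PE definition file per IB domain, then print (not run) the
# qconf commands that would load them.
outdir=$(mktemp -d)

# Hypothetical helper: $1 = PE name, $2 = slot limit (placeholder value).
write_pe() {
    cat > "$outdir/$1.pe" <<EOF
pe_name            $1
slots              $2
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    \$fill_up
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary FALSE
EOF
}

for pe in ib1 ib2; do
    write_pe "$pe" 999
    # qconf -Ap adds a parallel environment from a file.
    echo "qconf -Ap $outdir/$pe.pe"
done

# The queue's pe_list still has to be edited by hand, per hostgroup.
echo "qconf -mq all.q"
```

After loading the files, `qconf -spl` should list the new PEs, and `qconf -sq all.q` shows whether the per-hostgroup `pe_list` entries took effect.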
> > > _______________________________________________
> > > users mailing list
> > > [email protected]
> > > https://gridengine.org/mailman/listinfo/users
> >
>


-- 
*William Bryce* | VP of Products
Univa Corporation <http://www.univa.com/> - 130 Esna Park Drive, Second
Floor, Markham, Ontario, Canada
*Email* [email protected] | *Mobile: 647.974.2841* | *Office: 647.478.5974*