[OMPI devel] One orted for each process on the same host - any problem?

2011-05-04 Thread Tony Lam

Hi,

I understand that a single orted is normally shared by all MPI processes from 
the same communicator on each execution host. Does anyone see any problem for 
MPI/OMPI if each process instead has its own orted? My guess is that it is 
less efficient in terms of MPI communication and memory footprint, but for 
simplifying our integration with OMPI, launching one orted for each MPI 
process is much easier to do.


I would appreciate it if someone could confirm whether this setup will work.

Thanks.

Tony



Re: [OMPI devel] One orted for each process on the same host - any problem?

2011-05-04 Thread Ralph Castain

On May 4, 2011, at 1:51 PM, Tony Lam wrote:

> Hi,
> 
> I understand that a single orted is normally shared by all MPI processes from 
> the same communicator on each execution host. Does anyone see any problem for 
> MPI/OMPI if each process instead has its own orted? My guess is that it is 
> less efficient in terms of MPI communication and memory footprint, but for 
> simplifying our integration with OMPI, launching one orted for each MPI 
> process is much easier to do.

The orteds won't care, but the mapper may get confused, and the MPI apps 
definitely will. Locality is based on being connected to the same orted, so the 
MPI apps will all declare themselves to be on different nodes - which means 
shared memory will be disabled. If you don't plan to use shared memory, then 
this won't matter.

You'll also see some inefficiencies in out-of-band messaging, for example when 
InfiniBand connections are being opened, but that's pretty minor.
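
As an illustration of the locality point (an added sketch, not code from the 
thread): an MPI program can ask the library how many ranks it believes share a 
node. MPI_Comm_split_type with MPI_COMM_TYPE_SHARED is the MPI-3 way to do 
that (it postdates this 2011 thread, so treat it purely as illustration); under 
a one-orted-per-process launch each rank would be expected to report a local 
group of size 1 even when all ranks run on the same host.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int world_rank, local_size;
        MPI_Comm node_comm;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

        /* Group the ranks that the MPI library considers able to share memory. */
        MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                            MPI_INFO_NULL, &node_comm);
        MPI_Comm_size(node_comm, &local_size);

        /* With one orted per process, this is expected to print 1 for every rank. */
        printf("rank %d sees %d rank(s) on its node\n", world_rank, local_size);

        MPI_Comm_free(&node_comm);
        MPI_Finalize();
        return 0;
    }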

> 
> I would appreciate it if someone could confirm whether this setup will work.
> 
> Thanks.
> 
> Tony
> 




Re: [OMPI devel] One orted for each process on the same host - any problem?

2011-05-04 Thread Thomas Herault
Hi,

Could you explain why you would like one orted on top of each MPI process?
Some needs, such as resource usage limitation or accounting, can be addressed 
without changing the one-daemon-per-node deployment.
Or do you enforce other kinds of restrictions on the orted process? Why 
wouldn't it be able to launch more than one MPI process, or why would that not 
be desirable?

Bests,
Thomas

On May 4, 2011, at 15:51, Tony Lam wrote:

> Hi,
> 
> I understand that a single orted is normally shared by all MPI processes from 
> the same communicator on each execution host. Does anyone see any problem for 
> MPI/OMPI if each process instead has its own orted? My guess is that it is 
> less efficient in terms of MPI communication and memory footprint, but for 
> simplifying our integration with OMPI, launching one orted for each MPI 
> process is much easier to do.
> 
> I would appreciate it if someone could confirm whether this setup will work.
> 
> Thanks.
> 
> Tony
> 




Re: [OMPI devel] One orted for each process on the same host - any problem?

2011-05-04 Thread Tony Lam

Hi Thomas,

We need to track job resource usage in our resource manager for
accounting and resource-policy enforcement, and sharing a single orted
process across multiple jobs makes that tracking much more complicated. We
don't enforce other restrictions, and I'd appreciate any suggestion on how
to resolve or work around this.

We have thought about mapping all processes from one mpirun into a
single job to simplify job resource tracking, but that would require 
widespread changes in our software.


Thanks.

Tony


On 05/04/11 15:34, Thomas Herault wrote:

Hi,

Could you explain why you would like one orted on top of each MPI process?
Some needs, such as resource usage limitation or accounting, can be addressed 
without changing the one-daemon-per-node deployment.
Or do you enforce other kinds of restrictions on the orted process? Why 
wouldn't it be able to launch more than one MPI process, or why would that not 
be desirable?

Bests,
Thomas



Re: [OMPI devel] One orted for each process on the same host - any problem?

2011-05-04 Thread Ralph Castain
In that case, why not just launch the processes directly, without the orted? We 
do it with SLURM and even have the ability to do it with Torque - so it could 
be done.

See the orte/mca/ess/slurmd component for an example of how to do so.
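
For context, here is an illustrative sketch (not from the thread, and not the 
slurmd component itself) of what "direct launch" means for a process: the 
resource manager, rather than an orted, starts each task and supplies its 
identity through the environment. SLURM_PROCID, SLURM_NTASKS and SLURM_NODEID 
are standard variables that srun exports per task; exactly how a given Open MPI 
version consumes them inside orte/mca/ess/slurmd is not shown here.

    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical illustration: a directly launched process learns its rank
     * and job size from the resource manager's environment instead of from an
     * orted. */
    int main(void)
    {
        const char *rank = getenv("SLURM_PROCID");  /* task rank set by srun  */
        const char *size = getenv("SLURM_NTASKS");  /* total number of tasks  */
        const char *node = getenv("SLURM_NODEID");  /* node index within job  */

        printf("rank=%s of %s on node %s\n",
               rank ? rank : "?", size ? size : "?", node ? node : "?");
        return 0;
    }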


On May 4, 2011, at 4:55 PM, Tony Lam wrote:

> Hi Thomas,
> 
> We need to track job resource usage in our resource manager for
> accounting and resource-policy enforcement, and sharing a single orted
> process across multiple jobs makes that tracking much more complicated. We
> don't enforce other restrictions, and I'd appreciate any suggestion on how
> to resolve or work around this.
> 
> We have thought about mapping all processes from one mpirun into a
> single job to simplify job resource tracking, but that would require 
> widespread changes in our software.
> 
> Thanks.
> 
> Tony




Re: [OMPI devel] One orted for each process on the same host - any problem?

2011-05-04 Thread Thomas Herault
Another approach I've seen used is to insert a resource manager agent above 
each Open MPI process (whether a runtime process or an application process). 
Of course, it depends on how you collect resource usage and enforce your 
resource limitation policy.

In the case I'm referring to, the agent was implemented as a Unix process, and 
all "application" processes had to be direct children of such an agent 
process. Here, "application" processes include all Open MPI runtime processes 
(orteds) as well as all user application processes. The trick was to have 
ORTE's deployment system launch all orteds under a resource manager agent, 
using the batch scheduler's usual launching mechanism, and then to insert 
another such agent into the application command line, so that each orted 
launched one resource manager agent per application process.

That's a lot of processes, but it could work without many changes to the code 
base, provided your setup is similar and you can launch as many resource 
agents per node as you want.
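
A minimal sketch of such a per-process accounting agent (hypothetical code, 
not the implementation Thomas describes): the agent execs the real command, 
whether an orted or an application process, as its direct child, then collects 
the child's resource usage with wait4() when it exits.

    #define _DEFAULT_SOURCE              /* for wait4() on glibc */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/resource.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Hypothetical agent: run as   agent <real command> [args...]   so the
     * wrapped process is a direct child whose usage can be accounted for. */
    int main(int argc, char **argv)
    {
        if (argc < 2) {
            fprintf(stderr, "usage: %s command [args...]\n", argv[0]);
            return 1;
        }

        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            return 1;
        }
        if (pid == 0) {
            execvp(argv[1], &argv[1]);      /* child: become the real process */
            perror("execvp");
            _exit(127);
        }

        int status;
        struct rusage usage;
        wait4(pid, &status, 0, &usage);     /* parent: reap child, get usage */

        /* Report (or forward to the resource manager) the child's usage. */
        fprintf(stderr, "pid %d: user %ld.%06lds sys %ld.%06lds maxrss %ld KB\n",
                (int)pid,
                (long)usage.ru_utime.tv_sec, (long)usage.ru_utime.tv_usec,
                (long)usage.ru_stime.tv_sec, (long)usage.ru_stime.tv_usec,
                usage.ru_maxrss);
        return WIFEXITED(status) ? WEXITSTATUS(status) : 1;
    }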

Thomas

On May 4, 2011, at 19:59, Ralph Castain wrote:

> In that case, why not just launch the processes directly, without the orted? 
> We do it with SLURM and even have the ability to do it with Torque - so it 
> could be done.
> 
> See the orte/mca/ess/slurmd component for an example of how to do so.
> 