Re: [OMPI devel] [OMPI svn] svn:open-mpi r15661

2007-07-27 Thread Bert Wesarg
> Author: jsquyres
> Date: 2007-07-26 21:06:36 EDT (Thu, 26 Jul 2007)
> New Revision: 15661
> URL: https://svn.open-mpi.org/trac/ompi/changeset/15661
>
> Log:
> Passing NULL to pthread_exit() is verboten.
Why? I can't find anything about this in the standard. Or is it some OMPI-internal convention?

Bert




Re: [OMPI devel] [OMPI svn] svn:open-mpi r15661

2007-07-27 Thread Jeff Squyres

On Jul 27, 2007, at 6:00 AM, Bert Wesarg wrote:


Passing NULL to pthread_exit() is verboten.
Why? I can't find anything about this in the standard. Or is it some
OMPI-internal convention?


The man page for pthread_exit(3) on Linux does not specifically say
that NULL is allowed.  Plus, on RHEL4 when using the TLS glibc and
OMPI was compiled with the PathScale compiler, passing NULL to
pthread_exit() triggers an abort deep within glibc.  I don't know why
this doesn't show up with other compilers, though.
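
For reference, one way to avoid the problem is simply to hand pthread_exit()
a dummy non-NULL status (or just return from the thread function); a minimal
sketch of the idea -- the names below are made up for illustration, not the
actual OMPI code:

    #include <pthread.h>

    static int dummy_status = 0;            /* illustrative name only */

    static void *thread_main(void *arg)
    {
        /* ... thread work ... */

        /* Return a harmless non-NULL status instead of NULL so the
         * affected glibc/compiler combination does not abort. */
        pthread_exit(&dummy_status);
        return NULL;                        /* not reached */
    }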


--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] Hostfiles - yet again

2007-07-27 Thread Ralph Castain



On 7/26/07 4:22 PM, "Aurelien Bouteiller"  wrote:

>> mpirun -hostfile big_pool -n 10 -host 1,2,3,4 application : -n 2 -host
>> 99,100 ft_server
> 
> This will not work: that is the way to launch MIMD jobs that share the
> same COMM_WORLD, not the way to launch two different applications that
> interact through Accept/Connect.
> 
> Direct consequences on simple NAS benchmarks are:
> * if the second command does not use MPI_Init, then the first
> application locks forever in MPI_Init
> * if both use MPI_Init, the MPI_Comm_size of each job is incorrect.
> 
> 
> 
> bouteill@dancer:~$ ompi-build/debug/bin/mpirun -prefix
> /home/bouteill/ompi-build/debug/ -np 4 -host node01,node02,node03,node04
> NPB3.2-MPI/bin/lu.A.4 : -np 1 -host node01 NPB3.2-MPI/bin/mg.A.1
> 
> 
>  NAS Parallel Benchmarks 3.2 -- LU Benchmark
> 
>  Warning: program is running on  5 processors
>  but was compiled for   4
>  Size:  64x 64x 64
>  Iterations: 250
>  Number of processes: 5

Okay - of course, I can't possibly have any idea how your application
works... ;-)

However, it would be trivial to simply add two options to the app_context
command line:

1. designates that this app_context is to be launched as a separate job

2. indicates that this app_context is to be "connected" a la connect/accept
to the other app_contexts (if you want, we could even take an argument
indicating which app_contexts it is to be connected to). Or we could reverse
this and indicate that we want it to be disconnected - all depends upon what
default people want to define.

This would solve the problem you describe while still allowing us to avoid
allocation confusion. I'll send it out separately as an RFC.

Thanks
Ralph





[OMPI devel] [RFC] New command line options to replace persistent daemon operations

2007-07-27 Thread Ralph Castain
WHAT:   Proposal to add two new command line options that will allow us to
replace the current need to separately launch a persistent daemon to
support connect/accept operations

WHY:    Remove problems of confusing multiple allocations, provide a cleaner
method for connect/accept between jobs

WHERE:  minor changes in orterun and orted, some code in rmgr and each pls
to ensure the proper jobid and connect info is passed to each
app_context as it is launched

TIMEOUT: 8/10/07

We currently do not support connect/accept operations in a clean way. Users
are required to first start a persistent daemon that operates in a
user-named universe. They then must enter the mpirun command for each
application in a separate window, providing the universe name on each
command line. This is required because (a) mpirun will not run in the
background (in fact, at one point in time it would segfault, though I
believe it now just hangs), and (b) we require that all applications using
connect/accept operate under the same HNP.

This is burdensome and appears to be causing problems for users as it
requires them to remember to launch that persistent daemon first -
otherwise, the applications execute, but never connect. Additionally, we
have the problem of confused allocations from the different login sessions.
This has caused numerous problems of processes going to incorrect locations,
allocations timing out at different times and causing jobs to abort, etc.

What I propose here is to eliminate the confusion in a manner that minimizes
code complexity. The idea is to utilize our so-painfully-developed multiple
app_context capability to have the user launch all the interacting
applications with the same mpirun command. This not only eliminates the
annoyance factor for users by eliminating the need for multiple steps and
login sessions, but also solves the problem of ensuring that all
applications are running in the same allocation (so we don't have to worry
any more about timeouts in one allocation aborting another job).

The proposal is to add two command line options that are associated with a
specific app_context (feel free to redefine the name of the option - I don't
personally care):

1. --independent-job - indicates that this app_context is to be launched as
an independent job. We will assign it a separate jobid, though we will map
it as part of the overall command (e.g., if by slot and no other directives
provided, it will start mapping where the prior app_context left off)

2. --connect x,y,z - only valid when combined with the above option,
indicates that this independent job is to be MPI-connected to app_contexts
x,y,z (where x,y,z are the numbers of the app_contexts, counting from the
beginning of the command - you choose whether we start from 0 or 1).
Alternatively, we can default to connecting to everyone, and then use
--disconnect to indicate we -don't- want to be connected.
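
To make this concrete, the two applications from the hostfile thread could
then be launched with a single command, something like this (option names as
proposed above and still hypothetical, app_contexts numbered from 1 here):

    mpirun -hostfile big_pool -np 10 -host 1,2,3,4 application : \
           --independent-job --connect 1 -np 2 -host 99,100 ft_server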

Note that this means the entire allocation for the combined app_contexts
must be provided. This helps the RTE tremendously to keep things straight,
and ensures that all the app_contexts will be able to complete (or not) in a
synchronized fashion.

It also allows us to eliminate the persistent daemon and multiple login
session requirements for connect/accept. That does not mean we cannot have a
persistent daemon to create a virtual machine, assuming we someday want to
support that mode of operation. This simply removes the requirement that the
user start one just so they can use connect/accept.

Comments?




Re: [OMPI devel] [RFC] New command line options to replace persistent daemon operations

2007-07-27 Thread Terry D. Dontje

Ralph Castain wrote:


WHAT:   Proposal to add two new command line options that will allow us to
   replace the current need to separately launch a persistent daemon to
   support connect/accept operations

WHY:Remove problems of confusing multiple allocations, provide a cleaner
   method for connect/accept between jobs

WHERE:  minor changes in orterun and orted, some code in rmgr and each pls
   to ensure the proper jobid and connect info is passed to each
   app_context as it is launched

 


It is my opinion that we would be better off attacking the issues of
the persistent daemons described below rather than creating a new set of
options to mpirun for process placement.  (More comments below on
the actual proposal.)


TIMEOUT: 8/10/07

We currently do not support connect/accept operations in a clean way. Users
are required to first start a persistent daemon that operates in a
user-named universe. They then must enter the mpirun command for each
application in a separate window, providing the universe name on each
command line. This is required because (a) mpirun will not run in the
background (in fact, at one point in time it would segfault, though I
believe it now just hangs), and (b) we require that all applications using
connect/accept operate under the same HNP.

This is burdensome and appears to be causing problems for users as it
requires them to remember to launch that persistent daemon first -
otherwise, the applications execute, but never connect. Additionally, we
have the problem of confused allocations from the different login sessions.
This has caused numerous problems of processes going to incorrect locations,
allocations timing out at different times and causing jobs to abort, etc.

What I propose here is to eliminate the confusion in a manner that minimizes
code complexity. The idea is to utilize our so-painfully-developed multiple
app_context capability to have the user launch all the interacting
applications with the same mpirun command. This not only eliminates the
annoyance factor for users by eliminating the need for multiple steps and
login sessions, but also solves the problem of ensuring that all
applications are running in the same allocation (so we don't have to worry
any more about timeouts in one allocation aborting another job).

The proposal is to add two command line options that are associated with a
specific app_context (feel free to redefine the name of the option - I don't
personally care):

1. --independent-job - indicates that this app_context is to be launched as
an independent job. We will assign it a separate jobid, though we will map
it as part of the overall command (e.g., if by slot and no other directives
provided, it will start mapping where the prior app_context left off)

 

I am unclear on what the option --connect really does.  The MPI codes
actually have to call MPI_Comm_connect to really connect to a process.
Can we get away with just the above option?


2. --connect x,y,z  - only valid when combined with the above option,
indicates that this independent job is to be MPI-connected to app_contexts
x,y,z (where x,y,z are the number of the app_context, counting from the
beginning of the command - you choose if we start from 0 or 1).
Alternatively, we can default to connecting to everyone, and then use
--disconnect to indicate we -don't- want to be connected.

Note that this means the entire allocation for the combined app_contexts
must be provided. This helps the RTE tremendously to keep things straight,
and ensures that all the app_contexts will be able to complete (or not) in a
synchronized fashion.

It also allows us to eliminate the persistent daemon and multiple login
session requirements for connect/accept. That does not mean we cannot have a
persistent daemon to create a virtual machine, assuming we someday want to
support that mode of operation. This simply removes the requirement that the
user start one just so they can use connect/accept.

Comments?







Re: [OMPI devel] [RFC] New command line options to replace persistent daemon operations

2007-07-27 Thread Ralph Castain



On 7/27/07 7:58 AM, "Terry D. Dontje"  wrote:

> Ralph Castain wrote:
> 
>> WHAT:   Proposal to add two new command line options that will allow us to
>>replace the current need to separately launch a persistent daemon to
>>support connect/accept operations
>> 
>> WHY:Remove problems of confusing multiple allocations, provide a cleaner
>>method for connect/accept between jobs
>> 
>> WHERE:  minor changes in orterun and orted, some code in rmgr and each pls
>>to ensure the proper jobid and connect info is passed to each
>>app_context as it is launched
>> 
>>  
>> 
> It is my opinion that we would be better off attacking the issues of
> the persistent daemons described below rather than creating a new set of
> options to mpirun for process placement.  (More comments below on
> the actual proposal.)

Non-trivial problems - we haven't figured them out in three years of
occasional effort. It isn't clear that they even -can- be solved when
considering the problem of running in multiple RM-based allocations.

I'll try to provide more detail on the problems when I return from my quick
trip...


> 
>> TIMEOUT: 8/10/07
>> 
>> We currently do not support connect/accept operations in a clean way. Users
>> are required to first start a persistent daemon that operates in a
>> user-named universe. They then must enter the mpirun command for each
>> application in a separate window, providing the universe name on each
>> command line. This is required because (a) mpirun will not run in the
>> background (in fact, at one point in time it would segfault, though I
>> believe it now just hangs), and (b) we require that all applications using
>> connect/accept operate under the same HNP.
>> 
>> This is burdensome and appears to be causing problems for users as it
>> requires them to remember to launch that persistent daemon first -
>> otherwise, the applications execute, but never connect. Additionally, we
>> have the problem of confused allocations from the different login sessions.
>> This has caused numerous problems of processes going to incorrect locations,
>> allocations timing out at different times and causing jobs to abort, etc.
>> 
>> What I propose here is to eliminate the confusion in a manner that minimizes
>> code complexity. The idea is to utilize our so-painfully-developed multiple
>> app_context capability to have the user launch all the interacting
>> applications with the same mpirun command. This not only eliminates the
>> annoyance factor for users by eliminating the need for multiple steps and
>> login sessions, but also solves the problem of ensuring that all
>> applications are running in the same allocation (so we don't have to worry
>> any more about timeouts in one allocation aborting another job).
>> 
>> The proposal is to add two command line options that are associated with a
>> specific app_context (feel free to redefine the name of the option - I don't
>> personally care):
>> 
>> 1. --independent-job - indicates that this app_context is to be launched as
>> an independent job. We will assign it a separate jobid, though we will map
>> it as part of the overall command (e.g., if by slot and no other directives
>> provided, it will start mapping where the prior app_context left off)
>> 
>>  
>> 
> I am unclear on what the option --connect really does.  The MPI codes
> actually have to call MPI_Comm_connect to really connect to a process.
> Can we get away with just the above option?


You are right - connect doesn't need to exist. I was thinking it would just
minimize the startup message as I wouldn't bother sharing RTE info across
jobs that weren't "connected". However, for MPI users, this probably would
be confusing, so I would suggest just dropping it. With the routed rml, it
won't have that much impact anyway (I think).


> 
>> 2. --connect x,y,z  - only valid when combined with the above option,
>> indicates that this independent job is to be MPI-connected to app_contexts
>> x,y,z (where x,y,z are the number of the app_context, counting from the
>> beginning of the command - you choose if we start from 0 or 1).
>> Alternatively, we can default to connecting to everyone, and then use
>> --disconnect to indicate we -don't- want to be connected.
>> 
>> Note that this means the entire allocation for the combined app_contexts
>> must be provided. This helps the RTE tremendously to keep things straight,
>> and ensures that all the app_contexts will be able to complete (or not) in a
>> synchronized fashion.
>> 
>> It also allows us to eliminate the persistent daemon and multiple login
>> session requirements for connect/accept. That does not mean we cannot have a
>> persistent daemon to create a virtual machine, assuming we someday want to
>> support that mode of operation. This simply removes the requirement that the
>> user start one just so they can use connect/accept.
>> 
>> Comments?
>> 
>> 

[OMPI devel] minor bug report for building openmpi-1.2.3 on cygwin

2007-07-27 Thread Andrew Lofthouse

Hi,

I've just built and installed openmpi-1.2.3 on cygwin.  It seems that 
most files depend on opal/mca/timer/windows/timer_windows.h, but 
opal/mca/timer/windows/timer_windows_component.c depends on 
opal/timer/windows/timer_windows_component.h (which doesn't exist).  I 
simply copied timer_windows.h to timer_windows_component.h and it built 
correctly.  I haven't yet compiled any MPI applications to check correct 
operation.


Regards,

AJL


Re: [OMPI devel] Hostfiles - yet again

2007-07-27 Thread George Bosilca
It's not about the app. It's about the MPI standard. With one mpirun
you start one MPI application (SPMD or MPMD, but still only one). The
first impact of this is that all processes started with one mpirun
command will belong to the same MPI_COMM_WORLD.


Our mpirun is in fact equivalent to the mpiexec defined in the MPI
standard. Therefore, we cannot change its behavior outside the MPI-2
standard boundaries.


Moreover, both of the approaches you described will only add corner
cases, which I would rather limit in number.


  george.


On Jul 27, 2007, at 8:42 AM, Ralph Castain wrote:





On 7/26/07 4:22 PM, "Aurelien Bouteiller"  wrote:

mpirun -hostfile big_pool -n 10 -host 1,2,3,4 application : -n 2 -host
99,100 ft_server


This will not work: that is the way to launch MIMD jobs that share the
same COMM_WORLD, not the way to launch two different applications that
interact through Accept/Connect.

Direct consequences on simple NAS benchmarks are:
* if the second command does not use MPI_Init, then the first
application locks forever in MPI_Init
* if both use MPI_Init, the MPI_Comm_size of each job is incorrect.



bouteill@dancer:~$ ompi-build/debug/bin/mpirun -prefix
/home/bouteill/ompi-build/debug/ -np 4 -host  
node01,node02,node03,node04

NPB3.2-MPI/bin/lu.A.4 : -np 1 -host node01 NPB3.2-MPI/bin/mg.A.1


 NAS Parallel Benchmarks 3.2 -- LU Benchmark

 Warning: program is running on  5 processors
 but was compiled for   4
 Size:  64x 64x 64
 Iterations: 250
 Number of processes: 5


Okay - of course, I can't possibly have any idea how your application
works... ;-)

However, it would be trivial to simply add two options to the app_context
command line:

1. designates that this app_context is to be launched as a separate job

2. indicates that this app_context is to be "connected" a la connect/accept
to the other app_contexts (if you want, we could even take an argument
indicating which app_contexts it is to be connected to). Or we could reverse
this and indicate that we want it to be disconnected - all depends upon what
default people want to define.

This would solve the problem you describe while still allowing us to avoid
allocation confusion. I'll send it out separately as an RFC.

Thanks
Ralph









Re: [OMPI devel] [RFC] New command line options to replace persistent daemon operations

2007-07-27 Thread Aurelien Bouteiller
I basically agree with Terry, even if your proposal would solve all
the issues I currently face. I think we need to read the MPI-2 standard
to make sure we are not on the brink of breaking the standard.


Aurelien


On Jul 27, 2007, at 10:13 , Ralph Castain wrote:





On 7/27/07 7:58 AM, "Terry D. Dontje"  wrote:


Ralph Castain wrote:

WHAT:   Proposal to add two new command line options that will  
allow us to
   replace the current need to separately launch a persistent  
daemon to

   support connect/accept operations

WHY:Remove problems of confusing multiple allocations,  
provide a cleaner

   method for connect/accept between jobs

WHERE:  minor changes in orterun and orted, some code in rmgr and  
each pls

   to ensure the proper jobid and connect info is passed to each
   app_context as it is launched




It is my opinion that we would be better off attacking the issues of
the persistent daemons described below rather than creating a new set of
options to mpirun for process placement.  (More comments below on
the actual proposal.)


Non-trivial problems - we haven't figured them out in three years of
occasional effort. It isn't clear that they even -can- be solved when
considering the problem of running in multiple RM-based allocations.

I'll try to provide more detail on the problems when I return from  
my quick

trip...





Re: [OMPI devel] Hostfiles - yet again

2007-07-27 Thread Ralph Castain
Guess I was unclear, George - I don't know enough about Aurelien's app to
know if it is capable of (or trying to) run as one job, or not.

What has been described on this thread to-date is, in fact, a corner case.
Hence the proposal of another way to possibly address a corner case without
disrupting the normal code operation.

May not be possible, per the other more general thread


On 7/27/07 8:31 AM, "George Bosilca"  wrote:

> It's not about the app. It's about the MPI standard. With one mpirun
> you start one MPI application (SPMD or MPMD but still only one). The
> first impact of this, is all processes started with one mpirun
> command will belong to the same MPI_COMM_WORLD.
> 
> Our mpirun is in fact equivalent to the mpiexec as defined in the MPI
> standard. Therefore, we cannot change its behavior outside the MPI
> 2 standard boundaries.
> 
> Moreover, both of the approaches you described will only add corner
> cases, which I rather prefer to limit in number.
> 
>george.
> 
> 
> On Jul 27, 2007, at 8:42 AM, Ralph Castain wrote:
> 
>> 
>> 
>> 
>> On 7/26/07 4:22 PM, "Aurelien Bouteiller"  wrote:
>> 
 mpirun -hostfile big_pool -n 10 -host 1,2,3,4 application : -n 2 -
 host
 99,100 ft_server
>>> 
>>> This will not work: this is a way to launch MIMD jobs, that share the
>>> same COMM_WORLD. Not the way to launch two different applications
>>> that
>>> interact through Accept/Connect.
>>> 
>>> Direct consequence on simple NAS benchmarks are:
>>> * if the second command does not use MPI-Init, then the first
>>> application locks forever in MPI-Init
>>> * if both use MPI init, the MPI_Comm_size of the jobs are incorrect.
>>> 
>>> 
>>> 
>>> bouteill@dancer:~$ ompi-build/debug/bin/mpirun -prefix
>>> /home/bouteill/ompi-build/debug/ -np 4 -host
>>> node01,node02,node03,node04
>>> NPB3.2-MPI/bin/lu.A.4 : -np 1 -host node01 NPB3.2-MPI/bin/mg.A.1
>>> 
>>> 
>>>  NAS Parallel Benchmarks 3.2 -- LU Benchmark
>>> 
>>>  Warning: program is running on  5 processors
>>>  but was compiled for   4
>>>  Size:  64x 64x 64
>>>  Iterations: 250
>>>  Number of processes: 5
>> 
>> Okay - of course, I can't possibly have any idea how your application
>> works... ;-)
>> 
>> However, it would be trivial to simply add two options to the
>> app_context
>> command line:
>> 
>> 1. designates that this app_context is to be launched as a separate
>> job
>> 
>> 2. indicates that this app_context is to be "connected" ala connect/
>> accept
>> to the other app_contexts (if you want, we could even take an argument
>> indicating which app_contexts it is to be connected to). Or we
>> could reverse
>> this as indicate we want it to be disconnected - all depends upon what
>> default people want to define.
>> 
>> This would solve the problem you describe while still allowing us
>> to avoid
>> allocation confusion. I'll send it out separately as an RFC.
>> 
>> Thanks
>> Ralph
>> 
>>> 
>>> 
>>> 




Re: [OMPI devel] Hostfiles - yet again

2007-07-27 Thread George Bosilca
You were clear. What we're trying to say here is that the solution you
described a few emails ago doesn't work. At least it doesn't work for
what we want to do (i.e., what Aurelien described in his first email).
We [really] need 2 separate MPI worlds that we will connect at a later
moment, not one larger MPI world.


Allow me to reiterate what we are looking for. We want to save some
information (related to fault tolerance, but that can be ignored here)
on another MPI application. The user will start his/her MPI application
in exactly the same way as before, plus 2 new MCA arguments: one for
enabling the message-logging approach and one for the connect/accept
port info. Once our internal framework is initialized in the user
application, it will connect to the spare MPI application (let's call
it the storage application), launched by the user on some specific
nodes that have better capabilities, as Aurelien described in his
initial email. Now the user application and the storage one will be
able to communicate via MPI, and therefore get the best performance out
of the available networks. Once the user application successfully
completes, the storage application can disappear (or not; we will take
what's available in Open MPI at that time).


This approach is not a corner case. It's a completely valid approach as
described in the MPI-2 standard. However, as usual, the MPI standard is
not very clear on how to manage the connection information, so this is
the big unknown here.
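
For reference, the accept/connect pattern we rely on is just the standard
MPI-2 one; here is a stripped-down sketch (error handling omitted, and how
the port string travels from one application to the other is exactly the
unclear part mentioned above):

    #include <mpi.h>
    #include <stdio.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        MPI_Comm inter;
        char port[MPI_MAX_PORT_NAME];

        MPI_Init(&argc, &argv);

        if (argc > 1 && 0 == strcmp(argv[1], "server")) {
            /* storage side: open a port and wait for the user application */
            MPI_Open_port(MPI_INFO_NULL, port);
            printf("port: %s\n", port);   /* must reach the other side somehow */
            MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter);
        } else if (argc > 1) {
            /* user-application side: port string passed as the first argument */
            strncpy(port, argv[1], MPI_MAX_PORT_NAME - 1);
            port[MPI_MAX_PORT_NAME - 1] = '\0';
            MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter);
        } else {
            MPI_Finalize();
            return 1;
        }

        /* ... normal MPI traffic over the resulting inter-communicator ... */

        MPI_Comm_disconnect(&inter);
        MPI_Finalize();
        return 0;
    }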


  george.

On Jul 27, 2007, at 11:08 AM, Ralph Castain wrote:

Guess I was unclear, George - I don't know enough about Aurelien's  
app to

know if it is capable of (or trying to) run as one job, or not.

What has been described on this thread to-date is, in fact, a  
corner case.
Hence the proposal of another way to possibly address a corner case  
without

disrupting the normal code operation.

May not be possible, per the other more general thread


On 7/27/07 8:31 AM, "George Bosilca"  wrote:


It's not about the app. It's about the MPI standard. With one mpirun
you start one MPI application (SPMD or MPMD but still only one). The
first impact of this, is all processes started with one mpirun
command will belong to the same MPI_COMM_WORLD.

Our mpirun is in fact equivalent to the mpiexec as defined in the MPI
standard. Therefore, we cannot change its behavior outside the MPI
2 standard boundaries.

Moreover, both of the approaches you described will only add corner
cases, which I rather prefer to limit in number.

   george.


On Jul 27, 2007, at 8:42 AM, Ralph Castain wrote:





On 7/26/07 4:22 PM, "Aurelien Bouteiller"   
wrote:



mpirun -hostfile big_pool -n 10 -host 1,2,3,4 application : -n 2 -
host
99,100 ft_server


This will not work: this is a way to launch MIMD jobs, that  
share the

same COMM_WORLD. Not the way to launch two different applications
that
interact through Accept/Connect.

Direct consequence on simple NAS benchmarks are:
* if the second command does not use MPI-Init, then the first
application locks forever in MPI-Init
* if both use MPI init, the MPI_Comm_size of the jobs are  
incorrect.




bouteill@dancer:~$ ompi-build/debug/bin/mpirun -prefix
/home/bouteill/ompi-build/debug/ -np 4 -host
node01,node02,node03,node04
NPB3.2-MPI/bin/lu.A.4 : -np 1 -host node01 NPB3.2-MPI/bin/mg.A.1


 NAS Parallel Benchmarks 3.2 -- LU Benchmark

 Warning: program is running on  5 processors
 but was compiled for   4
 Size:  64x 64x 64
 Iterations: 250
 Number of processes: 5


Okay - of course, I can't possibly have any idea how your  
application

works... ;-)

However, it would be trivial to simply add two options to the
app_context
command line:

1. designates that this app_context is to be launched as a separate
job

2. indicates that this app_context is to be "connected" ala connect/
accept
to the other app_contexts (if you want, we could even take an  
argument

indicating which app_contexts it is to be connected to). Or we
could reverse
this as indicate we want it to be disconnected - all depends upon  
what

default people want to define.

This would solve the problem you describe while still allowing us
to avoid
allocation confusion. I'll send it out separately as an RFC.

Thanks
Ralph









Re: [OMPI devel] COVERITY STATIC SOURCE CODE ANALYSIS

2007-07-27 Thread Jeff Squyres
It's been finalized: Coverity has formally joined the Open MPI  
Project as a Partner (OMPI web page updates will come soon).  They  
will be running the Open MPI source code base through their tools on  
a regular basis and making the results available to Members of the  
Open MPI project.


The scans will initially cover the v1.2 branch and trunk nightly
tarballs, and will likely start soon (possibly as early as next  
week).  We'll be working with Coverity to fully exploit the use of  
their tools as our familiarity/expertise grows.


A glance through a preliminary Coverity scan of the OMPI v1.2 code  
base shows three main kinds of problems:


1. corner cases not handled properly when errors occur at run time.   
It's unsurprising that these cases are buggy since these run-time  
errors probably have not occurred much in practice.


2. some false positives (or perhaps I'm just not understanding the  
results...?).


3. genuine problems/bugs.

I think the use of these tools will be a great help in hardening the
Open MPI code base.


Woo hoo!


On Jul 19, 2007, at 9:10 PM, Jeff Squyres wrote:


Yes, we have (someone else brought it to our attention a few months
ago).  :-)

Hopefully we'll have more news on this front in the not-distant  
future.



On Jul 19, 2007, at 9:07 PM, Lisandro Dalcin wrote:


Have any of you ever considered asking for Open MPI to be included
here, as it is an open source project?

http://scan.coverity.com/index.html

From many sources (mainly related to Python), it seems the results are
impressive.

Regards,

--
Lisandro Dalcín
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594




--
Jeff Squyres
Cisco Systems





--
Jeff Squyres
Cisco Systems




[OMPI devel] FW: [RFC] Sparse group implementation

2007-07-27 Thread Mohamad Chaarawi
I updated the RFC.

> From: Jeff Squyres 
> Date: July 25, 2007 9:04:44 AM EDT
> To: Open Developers 
> Subject: [OMPI devel] [RFC] Sparse group implementation
> Reply-To: Open MPI Developers 
>
> WHAT:Merge the sparse groups work to the trunk; get the
> community's
>   opinion on one remaining issue.
> WHY: For large MPI jobs, it can be memory-prohibitive to fully
>   represent dense groups; you can save a lot of space by
> having
>   "sparse" representations of groups that are (for example)
>   derived from MPI_COMM_WORLD.
> WHERE:   Main changes are (might have missed a few in this analysis,
>   but this is 99% of it):
>   - Big changes in ompi/group
>   - Moderate changes in ompi/comm
>   - Trivial changes in ompi/mpi/c, ompi/mca/pml/[dr|ob1],
> ompi/mca/comm/sm
> WHEN:The code is ready now in /tmp/sparse (it is passing
>   all Intel and IBM tests; see below).
> TIMEOUT: We'll merge all the work to the trunk and enable the
>   possibility of using sparse groups (dense will still be the
>   default, of course) if no one objects by COB Tuesday, 31 Aug
>   2007.
>
> ==========================================================================
>
> The sparse groups work from U. Houston is ready to be brought into the

> trunk.  It is built on the premise that for very large MPI jobs, you
> don't want to fully represent MPI groups in memory if you don't have
> to.  Specifically, you can save memory for communicators/groups that
> are derived from MPI_COMM_WORLD by representing them in a sparse
> storage format.
>
> The sparse groups work introduces 3 new ompi_group_t storage formats:
>
> * dense (i.e., what it is today -- an array of ompi_proc_t pointers)
> * sparse, where the current group's contents are based on the group
>from which the child was derived:
>1. range: a series of (offset,length) tuples
>2. stride: a single (first,stride,last) tuple
>3. bitmap: a bitmap
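
(For illustration only: the structs below are a sketch of what the three
sparse representations amount to, not the actual ompi_group_t internals --
every name here is made up.)

    /* "range": the group is a list of (offset,length) runs of ranks taken
     * from the parent group */
    struct sparse_range  { int offset; int length; };

    /* "stride": a single (first,stride,last) triple over the parent group */
    struct sparse_stride { int first; int stride; int last; };

    /* "bitmap": one bit per parent-group rank; bit i set means parent
     * rank i is a member of this group */
    struct sparse_bitmap { unsigned char *bits; int nbits; };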
>
> Currently, all the sparse groups code must be enabled by configuring
> with --enable-sparse-groups.  If sparse groups are enabled, each MPI
> group that is created will automatically use the storage format that
> takes the least amount of space.
>
> The Big Issue with the sparse groups is that getting a pointer to an
> ompi_proc_t may no longer be an O(1) operation -- you can't just
> access it via comm->group->procs[i].  Instead, you have to call a
> macro.  If sparse groups are enabled, this will call a function to do
> the resolution and return the proc pointer.  If sparse groups are not
> enabled, the macro currently resolves to group->procs[i].

Actually there is no macro anymore. Brian suggested that we make it an
inline function (ompi_group_peer_lookup) that checks whether sparse groups
are enabled (#if OMPI_GROUP_SPARSE) and acts accordingly.
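
Something along these lines -- a sketch based on the description above, not
the exact code (the sparse-path helper name is a placeholder):

    static inline ompi_proc_t *ompi_group_peer_lookup(ompi_group_t *group,
                                                      int peer_id)
    {
    #if OMPI_GROUP_SPARSE
        /* sparse build: may have to walk up through parent groups until a
         * dense group is reached */
        return ompi_group_lookup_sparse(group, peer_id);
    #else
        /* dense build: direct O(1) array access */
        return group->procs[peer_id];
    #endif
    }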

>
> When sparse groups are enabled, looking up a proc pointer is an
> iterative process; you have to traverse up through one or more parent
> groups until you reach a "dense" group to get the pointer.  So the
> time to lookup the proc pointer (essentially) depends on the group and

> how many times it has been derived from a parent group (there are
> corner cases where the lookup time is shorter).  Lookup times in
> MPI_COMM_WORLD are O(1) because it is dense, but it now requires an
> inline function call rather than directly accessing the data structure

> (see below).
>
> Note that the code in /tmp/sparse-groups is currently out-of-date with

> respect to the orte and opal trees due to SVN merge mistakes and
> problems.  Testing has occurred by copying full orte/opal branches
> from a trunk checkout into the sparse group tree, so we're confident
> that it's compatible with the trunk.  Full integration will occur
> before committing to the trunk, of course.

A new branch has been created in /tmp/sparse that works perfectly.

>
> The proposal we have for the community is as follows:
>
> 1. Remove the --enable-sparse-groups configure option
> 2. Default to use only dense groups (i.e., same as today)
> 3. If the new MCA parameter "mpi_use_sparse_groups" is enabled, enable
>    the use of sparse groups

The configure option will be kept. We will also have a runtime option
(mpi_use_sparse_groups) that is set by default when sparse groups are
enabled at configure time.

> 4. Eliminate the current macro used for group proc lookups and instead
> use an inline function of the form:
>
> static inline ompi_proc_t *lookup_group(ompi_group_t *group, int index)
> {
>     if (group_is_dense(group)) {
>         return group->procs[index];
>     } else {
>         return sparse_group_lookup(group, index);
>     }
> }
>

Done; however, the inline function uses #if instead of a runtime if().

> *** NOTE: This design adds a single "if" in some
> performance-critical paths.  If the group is sparse, it will
> add a function call and the

[OMPI devel] MPI_Win_get_group

2007-07-27 Thread Lisandro Dalcin
The MPI-2 standard says (see bottom of
)

MPI_WIN_GET_GROUP returns a duplicate of the group of the communicator
used to create the window associated with win. The group is returned
in group.

Please, note the 'duplicate' ...

Well, it seems OMPI (v1.2 svn) is not returning a duplicate: comparing
the handles with the C == operator gives true. Can you confirm this?
Should the word 'duplicate' be interpreted as 'a new reference to'?

As reference, MPICH2 seems to return different handles.
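
The check I am doing is essentially this (trimmed down; win and comm are an
already-created window and the communicator it was created on):

    MPI_Group cgroup, wgroup;

    MPI_Comm_group(comm, &cgroup);
    MPI_Win_get_group(win, &wgroup);

    if (cgroup == wgroup) {
        /* what I see with OMPI v1.2: the very same handle comes back */
    } else {
        /* what MPICH2 seems to do: a distinct handle comes back */
    }

    MPI_Group_free(&wgroup);
    MPI_Group_free(&cgroup);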

Anyway, I think the standard needs to be corrected/clarified. Perhaps
the strict 'duplication' does not make any sense.

Regards, and sorry for raising such low-level corner cases again ...


-- 
Lisandro Dalcín
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594