Re: [OMPI devel] RFC: Adding -DOPEN_MPI=1 to mpif77 and mpif90

2010-02-11 Thread Ralf Wildenhues
Hi Jeff,

* Jeff Squyres wrote on Wed, Feb 10, 2010 at 10:02:27PM CET:
> WHAT: Add -DOPEN_MPI=1 to the mpif77 and mpif90 command lines

> But we can put -DOPEN_MPI=1 in the argv that the wrapper adds.  This
> seems like a safe way to add it; it makes no difference whether the
> Fortran file is set to the preprocessor or not when it is compiled.

It won't work with IBM xlf which needs -WF,-D.  I'm sure there are other
Fortran compilers that don't grok -D either (and may not have any other
flag), but I'm not sure whether OpenMPI cares about them.

Cheers,
Ralf


Re: [OMPI devel] RFC: Adding -DOPEN_MPI=1 to mpif77 and mpif90

2010-02-11 Thread Jeff Squyres
On Feb 11, 2010, at 1:00 AM, Ralf Wildenhues wrote:

> * Jeff Squyres wrote on Wed, Feb 10, 2010 at 10:02:27PM CET:
> > WHAT: Add -DOPEN_MPI=1 to the mpif77 and mpif90 command lines
> 
> It won't work with IBM xlf which needs -WF,-D.  I'm sure there are other
> Fortran compilers that don't grok -D either (and may not have any other
> flag), but I'm not sure whether OpenMPI cares about them.

Ah, good!  

If we care, it is easy enough to add a configure test to figure this kind of 
stuff out.  

Are you aware of any other Fortran compilers that don't accept -D, and if so, 
what flags they *do* accept?  I would imagine a configure test that tries to 
compile a Fortran program that requires some preprocessor symbol to be set and 
then tries a few different command line flags (e.g., -D, -WF,-D, etc.) to 
figure out which one works (if any).  Hence, having a list of possible flags to 
check would be most useful.

-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL

2010-02-11 Thread Jeff Squyres (jsquyres)
Where does bcast(1) synchronize?

(Oops on typo - if reduce(1) wasn't defined, that definitely would be bad :) )

-jms
Sent from my PDA.  No type good.

- Original Message -
From: devel-boun...@open-mpi.org 
To: Open MPI Developers 
Sent: Wed Feb 10 12:50:03 2010
Subject: Re: [OMPI devel] failure with zero-lengthReduce()andbothsbuf=rbuf=NULL

On 10 February 2010 14:19, Jeff Squyres  wrote:
> On Feb 10, 2010, at 11:59 AM, Lisandro Dalcin wrote:
>
>> > If I remember correctly, the HPCC pingpong test synchronizes occasionally 
>> > by
>> > having one process send a zero-byte broadcast to all other processes.
>> >  What's a zero-byte broadcast?  Well, some MPIs apparently send no data, 
>> > but
>> > do have synchronization semantics.  (No non-root process can exit before 
>> > the
>> > root process has entered.)  Other MPIs treat the zero-byte broadcasts as
>> > no-ops;  there is no synchronization and then timing results from the HPCC
>> > pingpong test are very misleading.  So far as I can tell, the MPI standard
>> > doesn't address which behavior is correct.
>>
>> Yep... for p2p communication things are more clear (and behavior more
>> consistent in the MPI's out there) regarding zero-length messages...
>> IMHO, collectives should be no-ops only in the sense that no actual
>> reduction is made because there are no elements to operate on. I mean,
>> if Reduce(count=1) implies a sync, Reduce(count=0) should also imply a
>> sync...
>
> Sorry to disagree again.  :-)
>
> The *only* MPI collective operation that guarantees a synchronization is 
> barrier.  The lack of synchronization guarantee for all other collective 
> operations is very explicit in the MPI spec.

Of course.

> Hence, it is perfectly valid for an MPI implementation to do something like a 
> no-op when no data transfer actually needs to take place
>

So you say that an MPI implementation is free to make a sync in
the case of Bcast(count=1), but not in the case of Bcast(count=0)? I
could agree that such behavior is technically correct regarding the
MPI standard... But it makes me feel a bit uncomfortable... OK, in the
end, the change in semantics depending on message size is comparable
to the blocking/nonblocking one for MPI_Send(count=10^8) versus
Send(count=1).

>
> (except, of course, the fact that Reduce(count=1) isn't defined ;-) ).
>

You likely meant Reduce(count=0) ... Good catch ;-)


PS: The following question is unrelated to this thread, but my
curiosity+laziness cannot resist... Does Open MPI have some MCA
parameter to add a synchronization at every collective call?

-- 
Lisandro Dalcin
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL

2010-02-11 Thread Jeff Squyres (jsquyres)
I misparsed your reply. Yes, bcast(1) *can* sync if it wants to. I don't have a 
spec handy to check if bcast(0) is defined or not (similar to reduce). If it 
is, then sure, it could sync as well. 

My previous point was that barrier is the only collective that is *required* to 
synchronize. 

-jms 
Sent from my PDA. No type good.



From: devel-boun...@open-mpi.org  
To: de...@open-mpi.org  
Sent: Thu Feb 11 07:04:59 2010
Subject: Re: [OMPI devel] failure withzero-lengthReduce()andbothsbuf=rbuf=NULL 



Where does bcast(1) synchronize?

(Oops on typo - if reduce(1) wasn't defined, that definitely would be bad :) )

-jms
Sent from my PDA.  No type good.

- Original Message -
From: devel-boun...@open-mpi.org 
To: Open MPI Developers 
Sent: Wed Feb 10 12:50:03 2010
Subject: Re: [OMPI devel] failure with zero-lengthReduce()andbothsbuf=rbuf=NULL

On 10 February 2010 14:19, Jeff Squyres  wrote:
> On Feb 10, 2010, at 11:59 AM, Lisandro Dalcin wrote:
>
>> > If I remember correctly, the HPCC pingpong test synchronizes occasionally 
>> > by
>> > having one process send a zero-byte broadcast to all other processes.
>> >  What's a zero-byte broadcast?  Well, some MPIs apparently send no data, 
>> > but
>> > do have synchronization semantics.  (No non-root process can exit before 
>> > the
>> > root process has entered.)  Other MPIs treat the zero-byte broadcasts as
>> > no-ops;  there is no synchronization and then timing results from the HPCC
>> > pingpong test are very misleading.  So far as I can tell, the MPI standard
>> > doesn't address which behavior is correct.
>>
>> Yep... for p2p communication things are more clear (and behavior more
>> consistens in the MPI's out there) regarding zero-length messages...
>> IMHO, collectives should be non-op only in the sense that no actual
>> reduction is made because there are no elements to operate on. I mean,
>> if Reduce(count=1) implies a sync, Reduce(count=0) should also imply a
>> sync...
>
> Sorry to disagree again.  :-)
>
> The *only* MPI collective operation that guarantees a synchronization is 
> barrier.  The lack of synchronization guarantee for all other collective 
> operations is very explicit in the MPI spec.

Of course.

> Hence, it is perfectly valid for an MPI implementation to do something like a 
> no-op when no data transfer actually needs to take place
>

So you say that an MPI implementation is free to do make a sync in
case of Bcast(count=1), but not in the case of Bcast(count=0) ? I
could agree that such behavior is technically correct regarding the
MPI standard... But it makes me feel a bit uncomfortable... OK, in the
end, the change on semantic depending on message sizes is comparable
to the blocking/nonblocking one for  MPI_Send(count=10^8) versus
Send(count=1).

>
> (except, of course, the fact that Reduce(count=1) isn't defined ;-) ).
>

You likely meant Reduce(count=0) ... Good catch ;-)


PS: The following question is unrelated to this thread, but my
curiosity+laziness cannot resist... Does Open MPI has some MCA
parameter to add a synchronization at every collective call?

--
Lisandro Dalcin
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL

2010-02-11 Thread Jeff Squyres
On Feb 11, 2010, at 7:10 AM, Jeff Squyres (jsquyres) wrote:

> I misparsed your reply. Yes, bcast(1) *can* sync if it wants to. I don't have 
> a spec handy to check if bcast(0) is defined or not (similar to reduce). If 
> it is, then sure, it could sync as well. 

FWIW, in looking through MPI-2.2, I don't see any language in the description 
of MPI_BCAST that prevents 0-byte broadcasts.  Indeed, for the "count" 
parameter description, it distinctly says "non-negative integer", which, of 
course, includes 0.  I'm not sure why a zero-count broadcast is useful, but 
there it is.  :-)

That being said, it says "non-negative integer" for the count argument of 
MPI_REDUCE, too.  Hmm.  But I don't see getting around REDUCE's qualifying 
statement later in the text about how each process provides one or a sequence 
of elements.

Lisandro -- if you feel strongly about this, you might want to bring it up in 
the context of the Forum and ask about it.  I've provided my personal 
interpretations, but others certainly may disagree.  

-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL

2010-02-11 Thread George Bosilca

On Feb 11, 2010, at 07:10 , Jeff Squyres (jsquyres) wrote:

> I misparsed your reply. Yes, bcast(1) *can* sync if it wants to. I don't have 
> a spec handy to check if bcast(0) is defined or not (similar to reduce). If 
> it is, then sure, it could sync as well.

I have to disagree here. There is no synchronization in MPI except 
MPI_Barrier. At best, a bcast(1) is a one-way synchronization, as the only 
knowledge it gives to any rank (except the root) is that the root has reached the 
bcast. No assumptions about the other ranks should be made, as this is strongly 
dependent on the underlying algorithm, and the upper level does not have a way to 
know which algorithm is used. Similarly, a reduce(1) is the opposite of the 
bcast(1): the only certainty is at the root, and it means all other ranks 
have reached the reduce(1). 

Therefore, we can argue as much as you want about what the correct arguments of 
a reduce call should be, but a reduce(count=0) is one of the meaningless MPI calls 
and as such should not be tolerated.

Anyway, this discussion diverged from its original subject. The standard is 
pretty clear on what set of arguments is valid, and the fact that the send and 
receive buffers should be different is one of the strongest requirements (and 
this is independent of what count is). As a courtesy, Open MPI accepts the heresy 
of a count = zero, but there is __absolutely__ no reason to stop checking the 
values of the other arguments when this is true. If the user really wants to 
base the logic of his application on such a useless and non-standard statement 
(reduce(0)), at least he has to have the courtesy to provide a valid set of 
arguments.

  george.

PS: If I can suggest a correct approach to fix the Python bindings, I would 
encourage you to go for the strongest and most meaningful approach: sendbuf 
should always be different from recvbuf (independent of the value of count).
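
For illustration, a minimal C sketch of the "correctly shaped" zero-count reduce
George describes might look like the following (the dummy variables are purely
illustrative; they only keep sendbuf and recvbuf distinct and valid):

#include <mpi.h>

int main(int argc, char *argv[])
{
  /* Never read or written because count == 0, but they keep
     sendbuf != recvbuf and both addresses valid, so the argument
     checks described above are satisfied. */
  int senddummy = 0, recvdummy = 0;

  MPI_Init(&argc, &argv);
  MPI_Reduce(&senddummy, &recvdummy, 0, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
  MPI_Finalize();
  return 0;
}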



> My previous point was that barrier is the only collective that is *required* 
> to synchronize. 
> 
> -jms 
> Sent from my PDA. No type good.
> 
> From: devel-boun...@open-mpi.org  
> To: de...@open-mpi.org  
> Sent: Thu Feb 11 07:04:59 2010
> Subject: Re: [OMPI devel] failure 
> withzero-lengthReduce()andbothsbuf=rbuf=NULL 
> 
> Where does bcast(1) synchronize?
> 
> (Oops on typo - if reduce(1) wasn't defined, that definitely would be bad :) )
> 
> -jms
> Sent from my PDA.  No type good.
> 
> - Original Message -
> From: devel-boun...@open-mpi.org 
> To: Open MPI Developers 
> Sent: Wed Feb 10 12:50:03 2010
> Subject: Re: [OMPI devel] failure with 
> zero-lengthReduce()andbothsbuf=rbuf=NULL
> 
> On 10 February 2010 14:19, Jeff Squyres  wrote:
> > On Feb 10, 2010, at 11:59 AM, Lisandro Dalcin wrote:
> >
> >> > If I remember correctly, the HPCC pingpong test synchronizes 
> >> > occasionally by
> >> > having one process send a zero-byte broadcast to all other processes.
> >> >  What's a zero-byte broadcast?  Well, some MPIs apparently send no data, 
> >> > but
> >> > do have synchronization semantics.  (No non-root process can exit before 
> >> > the
> >> > root process has entered.)  Other MPIs treat the zero-byte broadcasts as
> >> > no-ops;  there is no synchronization and then timing results from the 
> >> > HPCC
> >> > pingpong test are very misleading.  So far as I can tell, the MPI 
> >> > standard
> >> > doesn't address which behavior is correct.
> >>
> >> Yep... for p2p communication things are more clear (and behavior more
> >> consistens in the MPI's out there) regarding zero-length messages...
> >> IMHO, collectives should be non-op only in the sense that no actual
> >> reduction is made because there are no elements to operate on. I mean,
> >> if Reduce(count=1) implies a sync, Reduce(count=0) should also imply a
> >> sync...
> >
> > Sorry to disagree again.  :-)
> >
> > The *only* MPI collective operation that guarantees a synchronization is 
> > barrier.  The lack of synchronization guarantee for all other collective 
> > operations is very explicit in the MPI spec.
> 
> Of course.
> 
> > Hence, it is perfectly valid for an MPI implementation to do something like 
> > a no-op when no data transfer actually needs to take place
> >
> 
> So you say that an MPI implementation is free to do make a sync in
> case of Bcast(count=1), but not in the case of Bcast(count=0) ? I
> could agree that such behavior is technically correct regarding the
> MPI standard... But it makes me feel a bit uncomfortable... OK, in the
> end, the change on semantic depending on message sizes is comparable
> to the blocking/nonblocking one for  MPI_Send(count=10^8) versus
> Send(count=1).
> 
> >
> > (except, of course, the fact that Reduce(count=1) isn't defined ;-) ).
> >
> 
> You likely meant Reduce(count=0) ... Good catch ;-)
> 
> 
> PS: The following question is unrelated to this thread, but my
> curiosity+laziness cannot resist... Does Open MPI has some MCA
> parameter to add a synchronization at every collective call?
> 
> --

Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL

2010-02-11 Thread Eugene Loh




Jeff Squyres (jsquyres) wrote:
> I misparsed your reply. Yes, bcast(1) *can* sync if it wants to. I
> don't have a spec handy to check if bcast(0) is defined or not (similar
> to reduce). If it is, then sure, it could sync as well.
>
> My previous point was that barrier is the only collective that is
> *required* to synchronize.

To clarify my comments on this thread...

There are causal synchronizations in all collectives.  E.g., a non-root
process cannot exit a broadcast before the root process has entered. 
The root process cannot exit a reduce before the last non-root process
has entered.  Stuff like that.  Those were the only syncs I was talking
about and the only sync that the HPCC pingpong test relied on.  I
wasn't talking about full barrier sync.
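
A minimal sketch of that causal dependency (assuming a deliberately delayed
root; the measured times are only illustrative and depend on the underlying
algorithm):

#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
  int rank, value = 0;
  double t0, t1;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  if (rank == 0) {
    sleep(2);            /* root enters the broadcast late */
    value = 42;
  }
  t0 = MPI_Wtime();
  MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);
  t1 = MPI_Wtime();
  /* Non-root ranks cannot leave the bcast before the root has entered it,
     so they wait roughly the root's delay; nothing here implies a full
     barrier among the non-root ranks themselves. */
  if (rank != 0)
    printf("rank %d spent %.2f s in MPI_Bcast\n", rank, t1 - t0);
  MPI_Finalize();
  return 0;
}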

Anyhow, a causal sync for a null collective is different.  There is no
data forcing synchronization.  Unlike point-to-point messages, there
isn't even header meta data.  So what behavior is required in the case
of null collectives?

Incidentally, in what respect is reduce(0) not defined?  It would seem
to me that it would be an array of 0 length, so we don't need to worry
about its datatype or contents.




[OMPI devel] Request_free() and Cancel() with REQUEST_NULL

2010-02-11 Thread Lisandro Dalcin
Why do Request_free() and Cancel() not fail when REQUEST_NULL is
passed? Am I missing something?

#include <mpi.h>

int main(int argc, char *argv[])
{
  MPI_Request req;
  MPI_Init(&argc, &argv);
  req = MPI_REQUEST_NULL;
  MPI_Request_free(&req);
  req = MPI_REQUEST_NULL;
  MPI_Cancel(&req);
  MPI_Finalize();
  return 0;
}


PS: The code above was tested with 1.4.1

-- 
Lisandro Dalcin
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594



Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL

2010-02-11 Thread Jeff Squyres
On Feb 11, 2010, at 10:04 AM, George Bosilca wrote:

>> I misparsed your reply. Yes, bcast(1) *can* sync if it wants to. I don't 
>> have a spec handy to check if bcast(0) is defined or not (similar to 
>> reduce). If it is, then sure, it could sync as well.
> 
> I have to disagree here. There are no synchronization in MPI except 
> MPI_Barrier.

There's no synchronization *guarantee* in MPI collectives except for 
MPI_BARRIER.

> At best, a bcast(1) is a one way synchronization, as the only knowledge it 
> gives to any rank (except root) is that the root has reached the bcast. No 
> assumptions about the other ranks should be made, as this is strongly 
> dependent on the underlying algorithm, and the upper level do not have a way 
> to know which algorithm is used. Similarly, a reduce(1) is the opposite of 
> the bcast(1), the only certain thing is at the root and it means all other 
> ranks had reached the reduce(1). 

I don't think we're disagreeing here.  All I'm saying is that BCAST *can* 
synchronize; I'm not saying it has to.  For example, using OMPI's sync coll 
module is perfectly legal because the MPI spec does not disallow synchronizing 
for collectives.  MPI only *requires* synchronizing for BARRIER.

Right?

> Therefore, we can argue as much as you want about what the correct arguments 
> of a reduce call should be, a reduce(count=0) is one of the meaningless MPI 
> calls and as such should not be tolerated.

No disagreement there.  I wish we could error on it.  "Darn the IMB torpedoes!  
Full speed ahead!!"  ;-)

-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI devel] Request_free() and Cancel() with REQUEST_NULL

2010-02-11 Thread George Bosilca
This has been corrected on the trunk 
(https://svn.open-mpi.org/trac/ompi/changeset/20537). Unfortunately, the 
corresponding patch didn't make it into the 1.4.1. I'll create a ticket to push 
it into the 1.4.2.

  george.

On Feb 11, 2010, at 10:15 , Lisandro Dalcin wrote:

> Why Request_free() and Cancel() do not fail when REQUEST_NULL is
> passed? Am I missing something?
> 
> #include 
> 
> int main(int argc, char *argv[])
> {
>  MPI_Request req;
>  MPI_Init(&argc, &argv);
>  req = MPI_REQUEST_NULL;
>  MPI_Request_free(&req);
>  req = MPI_REQUEST_NULL;
>  MPI_Cancel(&req);
>  MPI_Finalize();
>  return 0;
> }
> 
> 
> PS: The code below was tested with 1.4.1
> 
> -- 
> Lisandro Dalcin
> ---
> Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
> Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
> Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
> PTLC - Güemes 3450, (3000) Santa Fe, Argentina
> Tel/Fax: +54-(0)342-451.1594
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] Request_free() and Cancel() with REQUEST_NULL

2010-02-11 Thread Jeff Squyres
...and 1.5.  :-)

On Feb 11, 2010, at 10:53 AM, George Bosilca wrote:

> This has been corrected on the trunk 
> (https://svn.open-mpi.org/trac/ompi/changeset/20537). Unfortunately, the 
> corresponding patch didn't make it into the 1.4.1. I'll create a ticket to 
> push it into the 1.4.2.
> 
>   george.
> 
> On Feb 11, 2010, at 10:15 , Lisandro Dalcin wrote:
> 
> > Why Request_free() and Cancel() do not fail when REQUEST_NULL is
> > passed? Am I missing something?
> >
> > #include 
> >
> > int main(int argc, char *argv[])
> > {
> >  MPI_Request req;
> >  MPI_Init(&argc, &argv);
> >  req = MPI_REQUEST_NULL;
> >  MPI_Request_free(&req);
> >  req = MPI_REQUEST_NULL;
> >  MPI_Cancel(&req);
> >  MPI_Finalize();
> >  return 0;
> > }
> >
> >
> > PS: The code below was tested with 1.4.1
> >
> > --
> > Lisandro Dalcin
> > ---
> > Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
> > Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
> > Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
> > PTLC - Güemes 3450, (3000) Santa Fe, Argentina
> > Tel/Fax: +54-(0)342-451.1594
> >
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 


-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI devel] Request_free() and Cancel() with REQUEST_NULL

2010-02-11 Thread Jeff Squyres
On Feb 11, 2010, at 10:58 AM, Jeff Squyres (jsquyres) wrote:

> ...and 1.5.  :-)

Err... never mind.  It's already there.  :-)

/me slinks off into the night...

-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL

2010-02-11 Thread Lisandro Dalcin
On 11 February 2010 12:04, George Bosilca  wrote:
>
> Therefore, we can argue as much as you want about what the correct arguments 
> of a reduce call should be, a reduce(count=0) is one of the meaningless MPI 
> calls and as such should not be tolerated.
>

Well, I have to disagree... I understand you (as an MPI implementor)
think that Reduce(count=0) could be meaningless and add complexity to
the implementation of MPI_Reduce()... But Reduce(count=0) could save
user code from special-casing the count==0 situation... after all,
zero-length arrays/sequences/containers do appear in actual code...

> Anyway, this discussion diverged from its original subject. The standard is 
> pretty clear on what set of arguments are valid, and the fact that the send 
> and receive buffers should be different is one of the strongest requirement 
> (and this independent on what count is).

Sorry, if count=0, why is sendbuf!=recvbuf SO STRONGLY required? I
cannot figure out the answer...

> As a courtesy, Open MPI accepts the heresy of a count = zero, but there is 
> __absolutely__ no reason to stop checking the values of the other arguments 
> when this is true. If the user really want to base the logic of his 
> application on such a useless and non-standard statement (reduce(0)) at least 
> he has to have the courtesy to provide a valid set of arguments.

I'm still not convinced that reduce(0) is non-standard; as Jeff
pointed out, the standard says "non-negative integer". The latter
comment is, IMHO, not saying that count=0 is invalid; such a
conclusion is a misinterpretation. What would be the rationale for
making Reduce(count=0) invalid, when all the other
(communication+reduction) collective calls do not explicitly say that
count=0 is invalid, and "count" arguments are always described as
"non-negative integer"?

>
> PS: If I can suggest a correct approach to fix the python bindings I would 
> encourage you to go for the strongest and more meaningful approach, sendbuf 
> should always be different that recvbuf (independent on the value of count).
>

I have the feeling that you think I'm bikeshedding because I'm lazy or
I have nothing more useful to do :-)... That's not the case... I'm the
developer of an MPI wrapper; it is not my business to impose arbitrary
restrictions on users... and I would like MPI implementations to
follow that rule... if count=0, I cannot see why I should require
users to pass sendbuf!=recvbuf ... moreover, in a dynamic language like
Python, things are not always obvious...

Let me show you a little Python experiment... Enter your Python prompt,
and type this:

$ python
>>> from array import array
>>> a = array('i', []) # one zero-length array of integers (C-int)
>>> b = array('i', []) # other zero-length array
>>> a is b # are 'a' and 'b' the same object instance?
False
>>>

So far, so good... we have two different arrays of integers, and their
length is zero...
Let's see the values of the (pointer, length), where the pointer is
represented as its integer value:

>>> a.buffer_info()
(0, 0)
>>> b.buffer_info()
(0, 0)
>>>

Now, suppose I do this:

>>> from mpi4py import MPI
>>> MPI.COMM_WORLD.Reduce(a, b, op=MPI.SUM, root=0)
Traceback (most recent call last):
  File "", line 1, in 
  File "Comm.pyx", line 534, in mpi4py.MPI.Comm.Reduce (src/mpi4py.MPI.c:52115)
mpi4py.MPI.Exception: MPI_ERR_ARG: invalid argument of some other kind
>>>

Then an mpi4py user mails me asking: WTF? 'a' and 'b' were different
arrays, what's going on? Why did my call fail? And then I have to say:
this fails because of two implementation details... Built-in Python's
'array.array' instances have pointer=NULL when length=0, and your MPI
implementation requires sendbuf!=recvbuf, even if count=0 and
sendbuf=recvbuf=NULL... Again, you may still think that
Reduce(count=0), or any other (count=0) call, is nonsense,
and I may even agree with you... But IMHO that's not what the standard
says; and again, imposing restrictions on user code should not be our
business...

George, what could I do here? Should I forcibly pass a different, fake
value enforcing sendbuf!=recvbuf myself when count=0? Would this be
portable? What if another MPI implementation on some platform decides to
complain because the fake value I'm passing does not represent a valid
address?


PS: Maintaining an MPI-2 binding for Python requires a lot of care
and attention to little details. And I have to support Python>=2.3
and the new Python 3; on Windows, Linux, and OS X; with many of the
MPI-1 and MPI-2 implementations out there... Consistent behavior and
standard compliance in MPI implementations is FUNDAMENTAL to developing
portable wrappers for other languages... Unfortunately, things are not
so easy; mpi4py's source code and test suite are full of "if OMPI" and
"if MPICH2"...



-- 
Lisandro Dalcin
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Inv

[OMPI devel] MPI_Win_get_errhandler() and MPI_Win_set_errhandler() do not fail when passing MPI_WIN_NULL

2010-02-11 Thread Lisandro Dalcin
I've reported this long ago (alongside other issues now fixed)...

I can see that this is fixed in trunk and branches/v1.5, but not
backported to branches/v1.4

Any chance to get this for 1.4.2? Or should it wait until 1.5?


-- 
Lisandro Dalcin
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594



Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL

2010-02-11 Thread Christian Siebert


Jeff Squyres wrote:

There's no synchronization *guarantee* in MPI collectives except for  
MPI_BARRIER. [...] BCAST *can* synchronize; I'm not saying it has to.

I fully agree with Jeff and would even go a step further.

As has already been noted, there are also some implicit data  
dependencies due to the fact that we do "message passing". This means  
that a receiver can only get a message after the sender has posted it.  
So yes, all processes get their broadcast message only after the root  
called MPI_Bcast and the like. But does this necessarily imply that  
all processes block in such a call and return only after the senders  
have joined the communication? In my opinion, no correct and portable MPI  
program should rely on anything that is not explicitly stated in the  
standard.


Example to think about: I developed an MPI wrapper several years ago  
(for a slow interconnect), which almost immediately returned from  
blocking MPI calls. Instead of wasting time to wait for the senders,  
it utilized features of the virtual memory subsystem to protect the  
given message buffers from not-yet-allowed accesses (i.e., write  
access for send buffers and read access for receive buffer), and  
started the communication in the background like the nonblocking  
variants. The blocking (if at all) happened only at the time the data  
was actually accessed by the processor (so this implicit  
synchronization point we are talking about was just delayed). This  
enabled communication and computation overlap without rewriting the  
application (even for send operations or large messages due to  
pipelining) - just relink and see if it gets faster. I'm not totally  
sure that this is 100% MPI conformant - but as long as programmers don't  
rely on anything that is not explicitly stated in the standard, they  
could benefit from such implementations...





[OMPI devel] RFC: Rename ompi/include/mpi_portable_platform.h

2010-02-11 Thread Ralph Castain
WHAT: Rename ompi/include/mpi_portable_platform.h to be 
opal/include/opal_portable_platform.h

WHY:   The file includes definitions and macros that identify the compiler used 
to build the system, etc.
  The contents actually have nothing specific to do with MPI.

WHEN: Weekend of Feb 20th



I'm trying to rationalize the ompi_info system so that people who build 
different layers can still get a report of the MCA params, build configuration, 
etc. for the layers they build. Thus, there would be an "orte_info" and 
"opal_info" capability. Each would report not only the info for their own 
layer, but the layer(s) below. So ompi_info remains unchanged, orte_info 
reports ORTE and OPAL info, etc.

The problem I encountered is that the referenced file is required for the 
various "info" tools, but it exists in the MPI layer. Since the file is only 
accessed at build time, I can go ahead and reference it from within "orte_info" 
and "opal_info", but it does somewhat break the abstraction barrier to do so.

Given that the info in the file has nothing to do with MPI itself, it seemed 
reasonable to move it to opal...barring arguments to the contrary.

Ralph




Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL

2010-02-11 Thread George Bosilca
This is absolutely not true. Open MPI supports zero-length collective 
operations (all of them actually), but only if their arguments are correctly shaped.

What you're asking for is a free ticket to write MPI calls that do not follow 
the MPI requirements when a special value for count is given.

While zero-length arrays/sequences/containers do appear in real code, they are 
not equal to NULL. If they are NULL, that means they do not contain any useful 
data, and they don't need to be the source or target of any kind of [collective or 
point-to-point] communication.

  george.

On Feb 11, 2010, at 11:53 , Lisandro Dalcin wrote:

> Well, I have to disagree... I understand you (as an MPI implementor)
> think that Reduce(count=0) could be meaningless and add complexity to
> the implementation of MPI_Reduce()... But Reduce(count=0) could save
> user code of special-casing the count==0 situation... after all,
> zero-length arrays/sequences/containers do appear in actual codes...




[OMPI devel] RFC: Processor affinity hardware thread support

2010-02-11 Thread Terry . Dontje
WHAT: Add hardware thread support to processor affinity components and 
new options to orterun.


WHY: OMPI currently does not correctly recognize processors that support 
hardware threads.  In cases where the user uses the mpirun options 
-bind-to-* and -by-*, processes are bound to the first thread on each 
core.  In cases where the user specifies cores to bind to in the rankfile, 
those numbers are interpreted as thread ids as opposed to core ids.  
These ill side effects can lead to confusion as to which resources 
processes in a job are bound to, and in the worst case a user could end 
up unknowingly oversubscribing resources.


WHERE: orte/mca/rmaps, orte/mca/odls, orte/util/hostfile, 
orte/tools/orterun/orterun.c, opal/mca/paffinity   


WHEN: 03/15/10

TIMEOUT: 02/24/10
-
The current OMPI paffinity implementation uses PLPA to set bindings of 
processes to cores or sockets.  In systems that support hardware 
threads, however, PLPA looks at a hardware thread as a core and in 
certain cases may not be able to completely map all hardware threads.  
This happens because the paffinity framework does not recognize hardware 
threads. 

I propose support such that hardware thread resources can be identified 
and have processes bound to them.  (Note: we plan on creating a new 
paffinity component using the hwloc api as opposed to extending the PLPA 
component)


Once the paffinity framework supports hardware threads I would like to 
propose the following defaults and new options that will support 
hardware threads.  I think we should first implement the "Defaults" 
section, put it back, and then start on new options and 
rankfile/hostfile fields.


Defaults:

In the case of no process binding we maintain the current rule of not 
doing anything.


When -bind-to-core is used or a core binding is defined in the rankfile, the MPI 
process will be bound to all hardware threads on a core (the OS will manage the 
scheduling of processes between hardware threads).  This is similar to 
how OMPI handles scheduling of processes on cores when the 
-bind-to-socket option is specified to mpirun.


New Options to mpirun:

1.  -bind-to-thread - Bind processes to hardware threads, analogous to 
-bind-to-core and -bind-to-socket
2.  -threads-per-proc - Use the number of threads per process if used 
with one of the -bind-to* options
3.  -bythread - Associate processes with successive hardware threads if 
used with one of the bind-to-* options.
4.  -num-threads -  Specify the number of hardware threads per core (for 
cases where Open MPI doesn't already know this information)



New Fields to Rankfiles:

We'll be adding a third field to the slot specification of the rankfile. 
For a rankfile entry that has 3 fields specified for a slot, the last 
field is the hardware thread id.  Otherwise it is assumed that hardware 
thread scheduling is left up to the OS.

rank 0=aa slot=1:0:0-3
rank 1=bb slot=0:0
rank 2=cc slot=1-2

So in the case of rank 0 the process is bound to socket 1, core 0, and 
hardware threads 0-3.
In the case of rank 1 it is bound to socket 0, core 0, and hardware 
thread scheduling is left to the OS.
In the case of rank 2 it is bound to cores 1 and 2, and hardware thread 
scheduling is left to the OS.






Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL

2010-02-11 Thread Lisandro Dalcin
On 11 February 2010 15:06, George Bosilca  wrote:
> This is absolutely not true. Open MPI supports zero length collective 
> operations (all of them actually), but if their arguments are correctly 
> shaped.
>

OK, you are right here ...

> What you're asking for is a free ticket to write MPI calls that do not follow 
> the MPI requirements when a special value for count is given.
>

But you did not answer my previous question... What's the rationale
for requiring sendbuf!=recvbuf when count=0? I would argue you want a
free ticket :-) to put restrictions on user code (without an actual
rationale) in order to simplify your implementation.

> While zero-length arrays/sequence/containers do appears in real code, they 
> are not equal to NULL. If they are NULL, that means they do not contain any 
> useful data, and they don't need to be source or target of any kind of 
> [collective or point-to-point] communications.
>

Yes, I know. Moreover, I agree with you. NULL should be reserved for
invalid pointers, not for zero-length arrays... The problem is that
people out there seem to disagree or just do not pay any attention to
this, thus (pointer=NULL,length=0) DO APPEAR in real life (like the
Python example I previously showed you)... Additionally, some time ago
(while discussing MPI_Alloc_mem(size=0)) we commented on the different
return values for malloc(0) depending on the platform...

Well, this discussion got too far... In the end, I agree that
representing zero-length arrays with (pointer=NULL,length=0) should be
regarded as bad practice...



-- 
Lisandro Dalcin
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594



Re: [OMPI devel] MPI_Win_get_errhandler() and MPI_Win_set_errhandler() do not fail when passing MPI_WIN_NULL

2010-02-11 Thread Jeff Squyres
Yes -- it should be in 1.4.2 -- the CMR George filed after your mail earlier 
today includes both the REQUEST_NULL and WIN_*_ERRHANDLER stuff:

https://svn.open-mpi.org/trac/ompi/ticket/2257

I just added you to the CC.  

BUT: I think we should be careful with r20537; if we bring that over (and I 
think we should), we should *also* bring over r20616 because of a change I 
instigated in MPI-2.2 because of exactly this issue.

All of this stuff is already in the v1.5 branch.

Thanks for keeping after us!



On Feb 11, 2010, at 12:06 PM, Lisandro Dalcin wrote:

> I've reported this long ago (alongside other issues now fixed)...
> 
> I can see that this is fixed in trunk and branches/v1.5, but not
> backported to branches/v1.4
> 
> Any chance to get this for 1.4.2? Or should it wait until 1.5?
> 
> 
> --
> Lisandro Dalcin
> ---
> Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
> Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
> Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
> PTLC - Güemes 3450, (3000) Santa Fe, Argentina
> Tel/Fax: +54-(0)342-451.1594
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 


-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL

2010-02-11 Thread Jeff Squyres
On Feb 11, 2010, at 2:14 PM, Lisandro Dalcin wrote:

> But you did not answer my previous question... What's the rationale
> for requiring sendbuf!=recvbuf when count=0? I would argue you want a
> free ticket :-) to put restrictions on user code (without an actual
> rationale) in order to simplify your implementation.

I don't understand your assertion.  The MPI spec clearly says that sendbuf must 
!= recvbuf.  If you want the sendbuf to be the same as the recvbuf, MPI 
supports MPI_IN_PLACE for several operations.

I realize that's not what you're trying to do, but these are the semantics that 
MPI has defined.
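
A minimal sketch of the MPI_IN_PLACE variant mentioned above (only the root
passes MPI_IN_PLACE; the buffers are illustrative):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
  int rank, value;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  value = rank + 1;
  if (rank == 0) {
    /* Root: the receive buffer doubles as the root's contribution. */
    MPI_Reduce(MPI_IN_PLACE, &value, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    printf("sum = %d\n", value);
  } else {
    /* Non-root ranks: recvbuf is not significant here. */
    MPI_Reduce(&value, NULL, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
  }
  MPI_Finalize();
  return 0;
}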

> > While zero-length arrays/sequence/containers do appears in real code, they 
> > are not equal to NULL. If they are NULL, that means they do not contain any 
> > useful data, and they don't need to be source or target of any kind of 
> > [collective or point-to-point] communications.

And even stronger than this: remember that NULL *is* a valid pointer for MPI 
when it is paired with an appropriate datatype.  As I said in an earlier mail, 
NULL is therefore not a special case buffer for sendbuf or recvbuf.

To be absolutely clear: none of OMPI's MPI API calls have checks of the form:

    if (NULL == choice_buffer)
        return error;

> Yes, I know. Moreover, I agree with you. NULL should be reserved for
> invalid pointers, not for zero-length array...

But it is not.  MPI's datatype mechanism is so general that NULL is valid.

So yes, passing MPI_REDUCE(NULL, NULL, ...) violates the sendbuf!=recvbuf rule 
(partially because there is only one datatype in MPI_REDUCE).  If a language 
may convert a buffer representation to NULL for you behind the scenes, then 
it's up to the language binding to catch/correct that.

...at least by the wording in today's MPI spec.  That being said, your python 
example of buffers a and b unknowingly being transformed to NULL behind the 
scenes seems like a good thing that MPI should support better.  It's exactly 
these kinds of issues that would be helpful to know / discuss / propose 
improvements for MPI-3.

Could we convince you to come to a Forum meeting?  :-)

-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI devel] RFC: Rename ompi/include/mpi_portable_platform.h

2010-02-11 Thread Jeff Squyres
My only $0.02 is that if we rename it to opal_portable_platform.h, we must 
remember that this file is #included in mpi.h, and therefore it is installed in 
user OMPI installations.

$includedir/mpi_portable_platform.h was deemed to be a "safe" filename.  But 
we've already had a name conflict with "opal" -- so I'm not sure that 
$includedir/opal_portable_platform.h is a "safe" filename.  We might consider 
installing it in $includedir/openmpi/opal_portable_platform.h... or something 
like that (I realize that $includedir/openmpi/ is not necessarily a good 
choice for OPAL and ORTE standalone projects; so perhaps a little creativity is 
required here...).



On Feb 11, 2010, at 12:33 PM, Ralph Castain wrote:

> WHAT: Rename ompi/include/mpi_portable_platform.h to be 
> opal/include/opal_portable_platform.h
> 
> WHY:   The file includes definitions and macros that identify the compiler 
> used to build the system, etc.
>   The contents actually have nothing specific to do with MPI.
> 
> WHEN:Weekend of Feb 20th
> 
> 
> 
> I'm trying to rationalize the ompi_info system so that people who build 
> different layers can still get a report of the MCA params, build 
> configuration, etc. for the layers they build. Thus, there would be an 
> "orte_info" and "opal_info" capability. Each would report not only the info 
> for their own layer, but the layer(s) below. So ompi_info remains unchanged, 
> orte_info reports ORTE and OPAL info, etc.
> 
> The problem I encountered is that the referenced file is required for the 
> various "info" tools, but it exists in the MPI layer. Since the file is only 
> accessed at build time, I can go ahead and reference it from within 
> "orte_info" and "opal_info", but it does somewhat break the abstraction 
> barrier to do so.
> 
> Given that the info in the file has nothing to do with MPI itself, it seemed 
> reasonable to move it to opal...barring arguments to the contrary.
> 
> Ralph
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 


-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI devel] RFC: Rename ompi/include/mpi_portable_platform.h

2010-02-11 Thread Ralph Castain
I wouldn't change the installation location - just thought it would be good to 
avoid the abstraction break in the source code.

Remember - this file doesn't get installed at all unless we built the MPI 
layer...


On Feb 11, 2010, at 1:11 PM, Jeff Squyres wrote:

> My only $0.02 is that if we rename it to opal_portable_platform.h, we must 
> remember that this file is #included in mpi.h, and therefore it is installed 
> in user OMPI installations.
> 
> $includedir/mpi_portable_platform.h was deemed to be a "safe" filename.  But 
> we've already had a name conflict with "opal" -- so I'm not sure that 
> $includedir/opal_portable_platform.h is a "safe" filename.  We might consider 
> installing it in $includedir/openmpi/opal_portable_platform.h... or something 
> like that (I realize that $includeddir/openmpi/ is not necessarily a 
> good choice for OPAL and ORTE standalone projects; so perhaps a little 
> creativity is required here...).
> 
> 
> 
> On Feb 11, 2010, at 12:33 PM, Ralph Castain wrote:
> 
>> WHAT: Rename ompi/include/mpi_portable_platform.h to be 
>> opal/include/opal_portable_platform.h
>> 
>> WHY:   The file includes definitions and macros that identify the compiler 
>> used to build the system, etc.
>>  The contents actually have nothing specific to do with MPI.
>> 
>> WHEN:Weekend of Feb 20th
>> 
>> 
>> 
>> I'm trying to rationalize the ompi_info system so that people who build 
>> different layers can still get a report of the MCA params, build 
>> configuration, etc. for the layers they build. Thus, there would be an 
>> "orte_info" and "opal_info" capability. Each would report not only the info 
>> for their own layer, but the layer(s) below. So ompi_info remains unchanged, 
>> orte_info reports ORTE and OPAL info, etc.
>> 
>> The problem I encountered is that the referenced file is required for the 
>> various "info" tools, but it exists in the MPI layer. Since the file is only 
>> accessed at build time, I can go ahead and reference it from within 
>> "orte_info" and "opal_info", but it does somewhat break the abstraction 
>> barrier to do so.
>> 
>> Given that the info in the file has nothing to do with MPI itself, it seemed 
>> reasonable to move it to opal...barring arguments to the contrary.
>> 
>> Ralph
>> 
>> 
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> 
> 
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> 
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




[OMPI devel] documenting the PMPI profiling interface

2010-02-11 Thread Eugene Loh




In the MPI standard, the portion discussing the PMPI profiling
interface says:

 3. document the implementation of different language bindings of
    the MPI interface if they are layered on top of each other, so
    that the profiler developer knows whether she must implement the
    profile interface for each binding, or can economise by
    implementing it only for the lowest level routines.

http://www.mpi-forum.org/docs/mpi22-report/node313.htm#Node313

Do we have such documentation anywhere?  I don't see this in the OMPI
FAQ.

I played with this some.  I wrote a Fortran program that called
MPI_Send.  I wrote a Fortran wrapper that intercepted MPI_Send and
called PMPI_Send.  I wrote a C wrapper that did the same thing.  It
appears that both wrappers got called.  So, it looks like we should
advise users to provide *only* C wrappers (unless they *also* want to
intercept at the Fortran level).

Yes/no?
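
A minimal sketch of the C-level wrapper described above (assuming the
pre-MPI-3, non-const MPI_Send prototype of this era; the counter is purely
illustrative):

#include <mpi.h>

static long send_calls = 0;

int MPI_Send(void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm)
{
  /* Do the profiling work, then forward to the real implementation. */
  send_calls++;
  return PMPI_Send(buf, count, datatype, dest, tag, comm);
}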




Re: [OMPI devel] RFC: Rename ompi/include/mpi_portable_platform.h

2010-02-11 Thread Jeff Squyres
On Feb 11, 2010, at 3:57 PM, Ralph Castain wrote:

> I wouldn't change the installation location - just thought it would be good 
> to avoid the abstraction break in the source code.
> 
> Remember - this file doesn't get installed at all unless we built the MPI 
> layer...

Hmm.  That becomes an interesting abstraction break in itself -- a Makefile.am 
in opal has to know if we're building / installing the MPI layer...

-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI devel] documenting the PMPI profiling interface

2010-02-11 Thread Jeff Squyres
On Feb 11, 2010, at 4:13 PM, Eugene Loh wrote:

> In the MPI standard, the portion discussing the PMPI profiling interface says:
> 
>  3. document the implementation of different language
>  bindings of the MPI interface if they are layered on top
>  of each other, so that the profiler developer knows
>  whether she must implement the profile interface for
>  each binding, or can economise by implementing it
>  only for the lowest level routines. 
> 
> http://www.mpi-forum.org/docs/mpi22-report/node313.htm#Node313
> 
> Do we have such documentation anywhere?  I don't see this in the OMPI FAQ.
> 
> I played with this some.  I wrote a Fortran program that called MPI_Send.  I 
> wrote a Fortran wrapper that intercepted MPI_Send and called PMPI_Send.  I 
> wrote a C wrapper that did the same thing.  It appears that both wrappers got 
> called.  So, it looks like we should advise users to provide *only* C 
> wrappers (unless they *also* want to intercept at the Fortran level).
> 
> Yes/no?

Yes.  Mostly.

I believe there are a small number of exceptions to this... (/me checks...)

Ah yes, here's one: MPI_ERRHANDLER_CREATE() in Fortran does *not* call 
MPI_Errhandler_create().  Instead, it calls the back-end 
ompi_errhandler_create() function.  There's obscure reasons for this that are 
pretty uninteresting.  To be clear: if you profile this function in both C and 
Fortran and call it in Fortran, you *won't* see the corresponding C profile 
function invoked.

I don't know if there's an easy way to generate a full list of functions like 
this -- it might involve a troll through ompi/mpi/f77/*_f.c to see which ones 
call MPI_* functions for their back-end functionality vs. which ones don't.  I 
think most call MPI_* functions.

-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI devel] RFC: Adding -DOPEN_MPI=1 to mpif77 and mpif90

2010-02-11 Thread Chris Samuel

Jeff Squyres wrote:

[about -D not working with xlf]


If we care, it is easy enough to add a configure test to
figure this kind of stuff out.  


Might be worth logging a bug with the autotools/autoconf
people on this (if it's not already there), it's been
mentioned recently on their lists as something they
should look at doing better:

http://old.nabble.com/Re:-Can't-use-Fortran-90-95-compiler-for-F77-p26209677.html

cheers,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC


Re: [OMPI devel] documenting the PMPI profiling interface

2010-02-11 Thread Eugene Loh

Jeff Squyres wrote:


On Feb 11, 2010, at 4:13 PM, Eugene Loh wrote:

 


In the MPI standard, the portion discussing the PMPI profiling interface says:

3. document the implementation of different language
bindings of the MPI interface if they are layered on top
of each other, so that the profiler developer knows
whether she must implement the profile interface for
each binding, or can economise by implementing it
only for the lowest level routines. 


http://www.mpi-forum.org/docs/mpi22-report/node313.htm#Node313

Do we have such documentation anywhere?  I don't see this in the OMPI FAQ.

I played with this some.  I wrote a Fortran program that called MPI_Send.  I 
wrote a Fortran wrapper that intercepted MPI_Send and called PMPI_Send.  I 
wrote a C wrapper that did the same thing.  It appears that both wrappers got 
called.  So, it looks like we should advise users to provide *only* C wrappers 
(unless they *also* want to intercept at the Fortran level).

Yes/no?
   


Yes.  Mostly.

I believe there are a small number of exceptions to this... (/me checks...)

Ah yes, here's one: MPI_ERRHANDLER_CREATE() in Fortran does *not* call 
MPI_Errhandler_create().  Instead, it calls the back-end 
ompi_errhandler_create() function.  There's obscure reasons for this that are 
pretty uninteresting.  To be clear: if you profile this function in both C and 
Fortran and call it in Fortran, you *won't* see the corresponding C profile 
function invoked.

I don't know if there's an easy way to generate a full list of functions like 
this -- it might involve a troll through ompi/mpi/f77/*_f.c to see which ones 
call MPI_* functions for their back-end functionality vs. which ones don't.  I 
think most call MPI_* functions.
 

And I can imagine there are cases where you'd want to write the wrapper 
in the native language (e.g., Fortran) rather than C if handles are handled 
differently or something.


Back to the opening question:  is this documented anywhere?  (Such 
documentation *is* a requirement of the standard and OMPI is standard 
conforming, y'know.)


Re: [OMPI devel] RFC: Adding -DOPEN_MPI=1 to mpif77 and mpif90

2010-02-11 Thread Jeff Squyres
Mm... good to know.  Thanks!

On Feb 11, 2010, at 5:58 PM, Chris Samuel wrote:

> Jeff Squyres wrote:
> 
> [about -D not working with xlf]
> 
> > If we care, it is easy enough to add a configure test to
> > figure this kind of stuff out. 
> 
> Might be worth logging a bug with the autotools/autoconf
> people on this (if it's not already there), it's been
> mentioned recently on their lists as something they
> should look at doing better:
> 
> http://old.nabble.com/Re:-Can't-use-Fortran-90-95-compiler-for-F77-p26209677.html
> 
> cheers,
> Chris
> --
>   Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 


-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI devel] documenting the PMPI profiling interface

2010-02-11 Thread Jeff Squyres
On Feb 11, 2010, at 6:08 PM, Eugene Loh wrote:

> Back to the opening question:  is this documented anywhere?  (Such
> documentation *is* a requirement of the standard and OMPI is standard
> conforming, y'know.)

The code?  ;-)

No, I don't believe we document this stuff anywhere.

-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] RFC: Rename ompi/include/mpi_portable_platform.h

2010-02-11 Thread Rainer Keller
Hi Ralph,
hmm, I don't really care about the name itself.
As Jeff mentioned, we'd have an "abstraction break" either way.

The question I have is: why does orte_info need to include the information about 
which compiler it was compiled with ;-)?

We basically only care to warn users about a typical MPI-user compilation 
mismatch (C gcc + Fortran pgf77).

So, I would rather keep it as is...


Regards,
Rainer



On Thursday 11 February 2010 12:33:28 pm Ralph Castain wrote:
> WHAT: Rename ompi/include/mpi_portable_platform.h to be
>  opal/include/opal_portable_platform.h
> 
> WHY:   The file includes definitions and macros that identify the compiler
>  used to build the system, etc. The contents actually have nothing specific
>  to do with MPI.
> 
> WHEN:Weekend of Feb 20th
> 
> 
> 
> I'm trying to rationalize the ompi_info system so that people who build
>  different layers can still get a report of the MCA params, build
>  configuration, etc. for the layers they build. Thus, there would be an
>  "orte_info" and "opal_info" capability. Each would report not only the
>  info for their own layer, but the layer(s) below. So ompi_info remains
>  unchanged, orte_info reports ORTE and OPAL info, etc.
> 
> The problem I encountered is that the referenced file is required for the
>  various "info" tools, but it exists in the MPI layer. Since the file is
>  only accessed at build time, I can go ahead and reference it from within
>  "orte_info" and "opal_info", but it does somewhat break the abstraction
>  barrier to do so.
> 
> Given that the info in the file has nothing to do with MPI itself, it
>  seemed reasonable to move it to opal...barring arguments to the contrary.
> 
> Ralph
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 

-- 

Rainer Keller, PhD  Tel: +1 (865) 241-6293
Oak Ridge National Lab  Fax: +1 (865) 241-4811
PO Box 2008 MS 6164   Email: kel...@ornl.gov
Oak Ridge, TN 37831-2008AIM/Skype: rusraink



Re: [OMPI devel] documenting the PMPI profiling interface

2010-02-11 Thread Eugene Loh

Jeff Squyres wrote:


On Feb 11, 2010, at 6:08 PM, Eugene Loh wrote:
 


Back to the opening question:  is this documented anywhere?  (Such
documentation *is* a requirement of the standard and OMPI is standard
conforming, y'know.)
   


The code?  ;-)

No, I don't believe we document this stuff anywhere.
 

You lie like a dog.  Look again!  Er, best to wait for the FAQ refresh 
tonight.  And, careful about brushing up against the fresh paint.


Re: [OMPI devel] RFC: Rename ompi/include/mpi_portable_platform.h

2010-02-11 Thread Ralph Castain

On Feb 11, 2010, at 4:45 PM, Rainer Keller wrote:

> Hi Ralph,
> hmm, I don't really care about the name itselve.
> As Jeff mentioned, we'd have a "abstraction break" either way.

There is no abstraction break - I talked to Jeff about it and cleared up the 
confusion. The OMPI code will have an install line that installs the opal file 
in a to-be-determined place where mpi.h can include it. No "mpi" references 
required in OPAL.


> 
> The question I have, why does orte_info need to include the information, 
> which 
> compiler it was compiled with ;-)?

Because people create cross-compiled versions, use module files to define which 
one they are using, etc.

> 
> We basically only care to warn users about a typical MPI-user compilation 
> mismatch (C gcc+ Fortran pgf77).

Not quite correct - you need to know that it was built to cross-compile vs 
native.

HTH
Ralph

> 
> So, I would rather keep it as is...
> 
> 
> Regards,
> Rainer
> 
> 
> 
> On Thursday 11 February 2010 12:33:28 pm Ralph Castain wrote:
>> WHAT: Rename ompi/include/mpi_portable_platform.h to be
>> opal/include/opal_portable_platform.h
>> 
>> WHY:   The file includes definitions and macros that identify the compiler
>> used to build the system, etc. The contents actually have nothing specific
>> to do with MPI.
>> 
>> WHEN:Weekend of Feb 20th
>> 
>> 
>> 
>> I'm trying to rationalize the ompi_info system so that people who build
>> different layers can still get a report of the MCA params, build
>> configuration, etc. for the layers they build. Thus, there would be an
>> "orte_info" and "opal_info" capability. Each would report not only the
>> info for their own layer, but the layer(s) below. So ompi_info remains
>> unchanged, orte_info reports ORTE and OPAL info, etc.
>> 
>> The problem I encountered is that the referenced file is required for the
>> various "info" tools, but it exists in the MPI layer. Since the file is
>> only accessed at build time, I can go ahead and reference it from within
>> "orte_info" and "opal_info", but it does somewhat break the abstraction
>> barrier to do so.
>> 
>> Given that the info in the file has nothing to do with MPI itself, it
>> seemed reasonable to move it to opal...barring arguments to the contrary.
>> 
>> Ralph
>> 
>> 
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> 
> 
> -- 
> 
> Rainer Keller, PhD  Tel: +1 (865) 241-6293
> Oak Ridge National Lab  Fax: +1 (865) 241-4811
> PO Box 2008 MS 6164   Email: kel...@ornl.gov
> Oak Ridge, TN 37831-2008AIM/Skype: rusraink
>