Re: [OMPI devel] RFC: Adding -DOPEN_MPI=1 to mpif77 and mpif90
Hi Jeff,

* Jeff Squyres wrote on Wed, Feb 10, 2010 at 10:02:27PM CET:
> WHAT: Add -DOPEN_MPI=1 to the mpif77 and mpif90 command lines
>
> But we can put -DOPEN_MPI=1 in the argv that the wrapper adds. This
> seems like a safe way to add it; it makes no difference whether the
> Fortran file is sent to the preprocessor or not when it is compiled.

It won't work with IBM xlf, which needs -WF,-D. I'm sure there are other Fortran compilers that don't grok -D either (and may not have any other flag), but I'm not sure whether OpenMPI cares about them.

Cheers,
Ralf
Re: [OMPI devel] RFC: Adding -DOPEN_MPI=1 to mpif77 and mpif90
On Feb 11, 2010, at 1:00 AM, Ralf Wildenhues wrote:
> * Jeff Squyres wrote on Wed, Feb 10, 2010 at 10:02:27PM CET:
> > WHAT: Add -DOPEN_MPI=1 to the mpif77 and mpif90 command lines
>
> It won't work with IBM xlf, which needs -WF,-D. I'm sure there are other
> Fortran compilers that don't grok -D either (and may not have any other
> flag), but I'm not sure whether OpenMPI cares about them.

Ah, good! If we care, it is easy enough to add a configure test to figure this kind of stuff out. Are you aware of any other Fortran compilers that don't accept -D, and if so, what flags they *do* accept?

I would imagine a configure test that tries to compile a Fortran program that requires some preprocessor symbol to be set and then tries a few different command line flags (e.g., -D, -WF,-D, etc.) to figure out which one works (if any). Hence, having a list of possible flags to check would be most useful.

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL
Where does bcast(1) synchronize? (Oops on typo - if reduce(1) wasn't defined, that definitely would be bad :) ) -jms Sent from my PDA. No type good. - Original Message - From: devel-boun...@open-mpi.org To: Open MPI Developers Sent: Wed Feb 10 12:50:03 2010 Subject: Re: [OMPI devel] failure with zero-lengthReduce()andbothsbuf=rbuf=NULL On 10 February 2010 14:19, Jeff Squyres wrote: > On Feb 10, 2010, at 11:59 AM, Lisandro Dalcin wrote: > >> > If I remember correctly, the HPCC pingpong test synchronizes occasionally >> > by >> > having one process send a zero-byte broadcast to all other processes. >> > What's a zero-byte broadcast? Well, some MPIs apparently send no data, >> > but >> > do have synchronization semantics. (No non-root process can exit before >> > the >> > root process has entered.) Other MPIs treat the zero-byte broadcasts as >> > no-ops; there is no synchronization and then timing results from the HPCC >> > pingpong test are very misleading. So far as I can tell, the MPI standard >> > doesn't address which behavior is correct. >> >> Yep... for p2p communication things are more clear (and behavior more >> consistens in the MPI's out there) regarding zero-length messages... >> IMHO, collectives should be non-op only in the sense that no actual >> reduction is made because there are no elements to operate on. I mean, >> if Reduce(count=1) implies a sync, Reduce(count=0) should also imply a >> sync... > > Sorry to disagree again. :-) > > The *only* MPI collective operation that guarantees a synchronization is > barrier. The lack of synchronization guarantee for all other collective > operations is very explicit in the MPI spec. Of course. > Hence, it is perfectly valid for an MPI implementation to do something like a > no-op when no data transfer actually needs to take place > So you say that an MPI implementation is free to do make a sync in case of Bcast(count=1), but not in the case of Bcast(count=0) ? I could agree that such behavior is technically correct regarding the MPI standard... But it makes me feel a bit uncomfortable... OK, in the end, the change on semantic depending on message sizes is comparable to the blocking/nonblocking one for MPI_Send(count=10^8) versus Send(count=1). > > (except, of course, the fact that Reduce(count=1) isn't defined ;-) ). > You likely meant Reduce(count=0) ... Good catch ;-) PS: The following question is unrelated to this thread, but my curiosity+laziness cannot resist... Does Open MPI has some MCA parameter to add a synchronization at every collective call? -- Lisandro Dalcin --- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL
I misparsed your reply. Yes, bcast(1) *can* sync if it wants to. I don't have a spec handy to check if bcast(0) is defined or not (similar to reduce). If it is, then sure, it could sync as well. My previous point was that barrier is the only collective that is *required* to synchronize. -jms Sent from my PDA. No type good. From: devel-boun...@open-mpi.org To: de...@open-mpi.org Sent: Thu Feb 11 07:04:59 2010 Subject: Re: [OMPI devel] failure withzero-lengthReduce()andbothsbuf=rbuf=NULL Where does bcast(1) synchronize? (Oops on typo - if reduce(1) wasn't defined, that definitely would be bad :) ) -jms Sent from my PDA. No type good. - Original Message - From: devel-boun...@open-mpi.org To: Open MPI Developers Sent: Wed Feb 10 12:50:03 2010 Subject: Re: [OMPI devel] failure with zero-lengthReduce()andbothsbuf=rbuf=NULL On 10 February 2010 14:19, Jeff Squyres wrote: > On Feb 10, 2010, at 11:59 AM, Lisandro Dalcin wrote: > >> > If I remember correctly, the HPCC pingpong test synchronizes occasionally >> > by >> > having one process send a zero-byte broadcast to all other processes. >> > What's a zero-byte broadcast? Well, some MPIs apparently send no data, >> > but >> > do have synchronization semantics. (No non-root process can exit before >> > the >> > root process has entered.) Other MPIs treat the zero-byte broadcasts as >> > no-ops; there is no synchronization and then timing results from the HPCC >> > pingpong test are very misleading. So far as I can tell, the MPI standard >> > doesn't address which behavior is correct. >> >> Yep... for p2p communication things are more clear (and behavior more >> consistens in the MPI's out there) regarding zero-length messages... >> IMHO, collectives should be non-op only in the sense that no actual >> reduction is made because there are no elements to operate on. I mean, >> if Reduce(count=1) implies a sync, Reduce(count=0) should also imply a >> sync... > > Sorry to disagree again. :-) > > The *only* MPI collective operation that guarantees a synchronization is > barrier. The lack of synchronization guarantee for all other collective > operations is very explicit in the MPI spec. Of course. > Hence, it is perfectly valid for an MPI implementation to do something like a > no-op when no data transfer actually needs to take place > So you say that an MPI implementation is free to do make a sync in case of Bcast(count=1), but not in the case of Bcast(count=0) ? I could agree that such behavior is technically correct regarding the MPI standard... But it makes me feel a bit uncomfortable... OK, in the end, the change on semantic depending on message sizes is comparable to the blocking/nonblocking one for MPI_Send(count=10^8) versus Send(count=1). > > (except, of course, the fact that Reduce(count=1) isn't defined ;-) ). > You likely meant Reduce(count=0) ... Good catch ;-) PS: The following question is unrelated to this thread, but my curiosity+laziness cannot resist... Does Open MPI has some MCA parameter to add a synchronization at every collective call? -- Lisandro Dalcin --- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL
On Feb 11, 2010, at 7:10 AM, Jeff Squyres (jsquyres) wrote:
> I misparsed your reply. Yes, bcast(1) *can* sync if it wants to. I don't have
> a spec handy to check if bcast(0) is defined or not (similar to reduce). If
> it is, then sure, it could sync as well.

FWIW, in looking through MPI-2.2, I don't see any language in the description of MPI_BCAST that prevents 0-byte broadcasts. Indeed, for the "count" parameter description, it distinctly says "non-negative integer", which, of course, includes 0. I'm not sure why a zero-count broadcast is useful, but there it is. :-)

That being said, it says "non-negative integer" for the count argument of MPI_REDUCE, too. Hmm. But I don't see getting around REDUCE's qualifying statement later in the text about how each process provides one element or a sequence of elements.

Lisandro -- if you feel strongly about this, you might want to bring it up in the context of the Forum and ask about it. I've provided my personal interpretations, but others certainly may disagree.

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL
On Feb 11, 2010, at 07:10 , Jeff Squyres (jsquyres) wrote: > I misparsed your reply. Yes, bcast(1) *can* sync if it wants to. I don't have > a spec handy to check if bcast(0) is defined or not (similar to reduce). If > it is, then sure, it could sync as well. I have to disagree here. There are no synchronization in MPI except MPI_Barrier. At best, a bcast(1) is a one way synchronization, as the only knowledge it gives to any rank (except root) is that the root has reached the bcast. No assumptions about the other ranks should be made, as this is strongly dependent on the underlying algorithm, and the upper level do not have a way to know which algorithm is used. Similarly, a reduce(1) is the opposite of the bcast(1), the only certain thing is at the root and it means all other ranks had reached the reduce(1). Therefore, we can argue as much as you want about what the correct arguments of a reduce call should be, a reduce(count=0) is one of the meaningless MPI calls and as such should not be tolerated. Anyway, this discussion diverged from its original subject. The standard is pretty clear on what set of arguments are valid, and the fact that the send and receive buffers should be different is one of the strongest requirement (and this independent on what count is). As a courtesy, Open MPI accepts the heresy of a count = zero, but there is __absolutely__ no reason to stop checking the values of the other arguments when this is true. If the user really want to base the logic of his application on such a useless and non-standard statement (reduce(0)) at least he has to have the courtesy to provide a valid set of arguments. george. PS: If I can suggest a correct approach to fix the python bindings I would encourage you to go for the strongest and more meaningful approach, sendbuf should always be different that recvbuf (independent on the value of count). > My previous point was that barrier is the only collective that is *required* > to synchronize. > > -jms > Sent from my PDA. No type good. > > From: devel-boun...@open-mpi.org > To: de...@open-mpi.org > Sent: Thu Feb 11 07:04:59 2010 > Subject: Re: [OMPI devel] failure > withzero-lengthReduce()andbothsbuf=rbuf=NULL > > Where does bcast(1) synchronize? > > (Oops on typo - if reduce(1) wasn't defined, that definitely would be bad :) ) > > -jms > Sent from my PDA. No type good. > > - Original Message - > From: devel-boun...@open-mpi.org > To: Open MPI Developers > Sent: Wed Feb 10 12:50:03 2010 > Subject: Re: [OMPI devel] failure with > zero-lengthReduce()andbothsbuf=rbuf=NULL > > On 10 February 2010 14:19, Jeff Squyres wrote: > > On Feb 10, 2010, at 11:59 AM, Lisandro Dalcin wrote: > > > >> > If I remember correctly, the HPCC pingpong test synchronizes > >> > occasionally by > >> > having one process send a zero-byte broadcast to all other processes. > >> > What's a zero-byte broadcast? Well, some MPIs apparently send no data, > >> > but > >> > do have synchronization semantics. (No non-root process can exit before > >> > the > >> > root process has entered.) Other MPIs treat the zero-byte broadcasts as > >> > no-ops; there is no synchronization and then timing results from the > >> > HPCC > >> > pingpong test are very misleading. So far as I can tell, the MPI > >> > standard > >> > doesn't address which behavior is correct. > >> > >> Yep... for p2p communication things are more clear (and behavior more > >> consistens in the MPI's out there) regarding zero-length messages... 
> >> IMHO, collectives should be non-op only in the sense that no actual > >> reduction is made because there are no elements to operate on. I mean, > >> if Reduce(count=1) implies a sync, Reduce(count=0) should also imply a > >> sync... > > > > Sorry to disagree again. :-) > > > > The *only* MPI collective operation that guarantees a synchronization is > > barrier. The lack of synchronization guarantee for all other collective > > operations is very explicit in the MPI spec. > > Of course. > > > Hence, it is perfectly valid for an MPI implementation to do something like > > a no-op when no data transfer actually needs to take place > > > > So you say that an MPI implementation is free to do make a sync in > case of Bcast(count=1), but not in the case of Bcast(count=0) ? I > could agree that such behavior is technically correct regarding the > MPI standard... But it makes me feel a bit uncomfortable... OK, in the > end, the change on semantic depending on message sizes is comparable > to the blocking/nonblocking one for MPI_Send(count=10^8) versus > Send(count=1). > > > > > (except, of course, the fact that Reduce(count=1) isn't defined ;-) ). > > > > You likely meant Reduce(count=0) ... Good catch ;-) > > > PS: The following question is unrelated to this thread, but my > curiosity+laziness cannot resist... Does Open MPI has some MCA > parameter to add a synchronization at every collective call? > > --
Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL
Jeff Squyres (jsquyres) wrote:
> I misparsed your reply. Yes, bcast(1) *can* sync if it wants to. I don't have
> a spec handy to check if bcast(0) is defined or not (similar to reduce). If
> it is, then sure, it could sync as well. My previous point was that barrier
> is the only collective that is *required* to synchronize.
>
> From: devel-boun...@open-mpi.org
> To: de...@open-mpi.org
> Sent: Thu Feb 11 07:04:59 2010
> Subject: Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL
>
> Where does bcast(1) synchronize?
>
> (Oops on typo - if reduce(1) wasn't defined, that definitely would be bad :) )

To clarify my comments on this thread... There are causal synchronizations in all collectives. E.g., a non-root process cannot exit a broadcast before the root process has entered. The root process cannot exit a reduce before the last non-root process has entered. Stuff like that. Those were the only syncs I was talking about and the only sync that the HPCC pingpong test relied on. I wasn't talking about full barrier sync.

Anyhow, a causal sync for a null collective is different. There is no data forcing synchronization. Unlike point-to-point messages, there isn't even header metadata. So what behavior is required in the case of null collectives?

Incidentally, in what respect is reduce(0) not defined? It would seem to me that it would be an array of 0 length, so we don't need to worry about its datatype or contents.
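[For concreteness, a minimal sketch of the HPCC-style zero-byte broadcast being discussed, assuming rank 0 as root and a dummy buffer. Whether this call synchronizes at all is exactly the ambiguity in this thread: the only causal guarantee is one-way, and an implementation that treats it as a no-op provides no synchronization.]

#include <mpi.h>

int main(int argc, char *argv[])
{
    char dummy;
    MPI_Init(&argc, &argv);
    /* ... benchmark timing loop would go here ... */
    /* Zero-byte broadcast used as a cheap "synchronization" point.
       No non-root rank can leave this call before the root has entered
       it, but nothing stronger is guaranteed by the standard. */
    MPI_Bcast(&dummy, 0, MPI_CHAR, 0, MPI_COMM_WORLD);
    MPI_Finalize();
    return 0;
}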
[OMPI devel] Request_free() and Cancel() with REQUEST_NULL
Why do Request_free() and Cancel() not fail when REQUEST_NULL is passed? Am I missing something?

#include <mpi.h>

int main(int argc, char *argv[])
{
  MPI_Request req;
  MPI_Init(&argc, &argv);
  req = MPI_REQUEST_NULL;
  MPI_Request_free(&req);
  req = MPI_REQUEST_NULL;
  MPI_Cancel(&req);
  MPI_Finalize();
  return 0;
}

PS: The code above was tested with 1.4.1

--
Lisandro Dalcin
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL
On Feb 11, 2010, at 10:04 AM, George Bosilca wrote:
>> I misparsed your reply. Yes, bcast(1) *can* sync if it wants to. I don't
>> have a spec handy to check if bcast(0) is defined or not (similar to
>> reduce). If it is, then sure, it could sync as well.
>
> I have to disagree here. There are no synchronization in MPI except
> MPI_Barrier.

There's no synchronization *guarantee* in MPI collectives except for MPI_BARRIER.

> At best, a bcast(1) is a one way synchronization, as the only knowledge it
> gives to any rank (except root) is that the root has reached the bcast. No
> assumptions about the other ranks should be made, as this is strongly
> dependent on the underlying algorithm, and the upper level do not have a way
> to know which algorithm is used. Similarly, a reduce(1) is the opposite of
> the bcast(1), the only certain thing is at the root and it means all other
> ranks had reached the reduce(1).

I don't think we're disagreeing here. All I'm saying is that BCAST *can* synchronize; I'm not saying it has to. For example, using OMPI's sync coll module is perfectly legal because the MPI spec does not disallow synchronizing for collectives. MPI only *requires* synchronizing for BARRIER. Right?

> Therefore, we can argue as much as you want about what the correct arguments
> of a reduce call should be, a reduce(count=0) is one of the meaningless MPI
> calls and as such should not be tolerated.

No disagreement there. I wish we could error on it. "Darn the IMB torpedoes! Full speed ahead!!" ;-)

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] Request_free() and Cancel() with REQUEST_NULL
This has been corrected on the trunk (https://svn.open-mpi.org/trac/ompi/changeset/20537). Unfortunately, the corresponding patch didn't make it into the 1.4.1. I'll create a ticket to push it into the 1.4.2.

  george.

On Feb 11, 2010, at 10:15 , Lisandro Dalcin wrote:

> Why do Request_free() and Cancel() not fail when REQUEST_NULL is
> passed? Am I missing something?
>
> #include <mpi.h>
>
> int main(int argc, char *argv[])
> {
>   MPI_Request req;
>   MPI_Init(&argc, &argv);
>   req = MPI_REQUEST_NULL;
>   MPI_Request_free(&req);
>   req = MPI_REQUEST_NULL;
>   MPI_Cancel(&req);
>   MPI_Finalize();
>   return 0;
> }
>
> PS: The code above was tested with 1.4.1
>
> --
> Lisandro Dalcin
> ---
> Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
> Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
> Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
> PTLC - Güemes 3450, (3000) Santa Fe, Argentina
> Tel/Fax: +54-(0)342-451.1594
Re: [OMPI devel] Request_free() and Cancel() with REQUEST_NULL
...and 1.5. :-) On Feb 11, 2010, at 10:53 AM, George Bosilca wrote: > This has been corrected on the trunk > (https://svn.open-mpi.org/trac/ompi/changeset/20537). Unfortunately, the > corresponding patch didn't make it into the 1.4.1. I'll create a ticket to > push it into the 1.4.2. > > george. > > On Feb 11, 2010, at 10:15 , Lisandro Dalcin wrote: > > > Why Request_free() and Cancel() do not fail when REQUEST_NULL is > > passed? Am I missing something? > > > > #include > > > > int main(int argc, char *argv[]) > > { > > MPI_Request req; > > MPI_Init(&argc, &argv); > > req = MPI_REQUEST_NULL; > > MPI_Request_free(&req); > > req = MPI_REQUEST_NULL; > > MPI_Cancel(&req); > > MPI_Finalize(); > > return 0; > > } > > > > > > PS: The code below was tested with 1.4.1 > > > > -- > > Lisandro Dalcin > > --- > > Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) > > Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) > > Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) > > PTLC - Güemes 3450, (3000) Santa Fe, Argentina > > Tel/Fax: +54-(0)342-451.1594 > > > > ___ > > devel mailing list > > de...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/devel > > > ___ > devel mailing list > de...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/devel > -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] Request_free() and Cancel() with REQUEST_NULL
On Feb 11, 2010, at 10:58 AM, Jeff Squyres (jsquyres) wrote: > ...and 1.5. :-) Err... never mind. It's already there. :-) /me slinks off into the night... -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL
On 11 February 2010 12:04, George Bosilca wrote:
>
> Therefore, we can argue as much as you want about what the correct arguments
> of a reduce call should be, a reduce(count=0) is one of the meaningless MPI
> calls and as such should not be tolerated.
>

Well, I have to disagree... I understand that you (as an MPI implementor) think Reduce(count=0) could be meaningless and add complexity to the implementation of MPI_Reduce()... But Reduce(count=0) could save user code from special-casing the count==0 situation... after all, zero-length arrays/sequences/containers do appear in actual codes...

> Anyway, this discussion diverged from its original subject. The standard is
> pretty clear on what set of arguments are valid, and the fact that the send
> and receive buffers should be different is one of the strongest requirement
> (and this independent on what count is).

Sorry, but if count=0, why is sendbuf!=recvbuf SO STRONGLY required? I cannot figure out the answer...

> As a courtesy, Open MPI accepts the heresy of a count = zero, but there is
> __absolutely__ no reason to stop checking the values of the other arguments
> when this is true. If the user really want to base the logic of his
> application on such a useless and non-standard statement (reduce(0)) at least
> he has to have the courtesy to provide a valid set of arguments.

I'm still not convinced that reduce(0) is non-standard; as Jeff pointed out, the standard says "non-negative integer". The later comment is IMHO not saying that count=0 is invalid; such a conclusion is a misinterpretation. What would be the rationale for making Reduce(count=0) invalid, when all other (communication+reduction) collective calls do not explicitly say that count=0 is invalid, and "count" arguments are always described as "non-negative integer"?

> PS: If I can suggest a correct approach to fix the python bindings I would
> encourage you to go for the strongest and more meaningful approach, sendbuf
> should always be different that recvbuf (independent on the value of count).

I have the feeling that you think I'm bikeshedding because I'm lazy or have nothing more useful to do :-)... That's not the case... I'm the developer of an MPI wrapper, and it is not my business to impose arbitrary restrictions on users... so I would like MPI implementations to follow that rule too... If count=0, I cannot see why I should force users to pass sendbuf!=recvbuf... Moreover, in a dynamic language like Python, things are not always obvious... Let me show you a little Python experiment. Enter your Python prompt and type this:

$ python
>>> from array import array
>>> a = array('i', [])  # one zero-length array of integers (C int)
>>> b = array('i', [])  # another zero-length array
>>> a is b              # are 'a' and 'b' the same object instance?
False
>>>

So far, so good... we have two different arrays of integers, and their length is zero. Let's see the values of the (pointer, length) pairs, where the pointer is represented as its integer value:

>>> a.buffer_info()
(0, 0)
>>> b.buffer_info()
(0, 0)
>>>

Now, suppose I do this:

>>> from mpi4py import MPI
>>> MPI.COMM_WORLD.Reduce(a, b, op=MPI.SUM, root=0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "Comm.pyx", line 534, in mpi4py.MPI.Comm.Reduce (src/mpi4py.MPI.c:52115)
mpi4py.MPI.Exception: MPI_ERR_ARG: invalid argument of some other kind
>>>

Then an mpi4py user mails me asking: WTF? 'a' and 'b' were different arrays, what's going on? Why did my call fail?

And then I have to say: this fails because of two implementation details... Built-in Python 'array.array' instances have pointer=NULL when length=0, and your MPI implementation requires sendbuf!=recvbuf, even if count=0 and sendbuf=recvbuf=NULL...

Again, you may still think that Reduce(count=0), or any other (count=0) collective, is nonsense; I may even agree with you... But IMHO that's not what the standard says, and again, imposing restrictions on user codes should not be our business...

George, what could I do here? Should I forcibly pass a different, fake value to enforce sendbuf!=recvbuf myself when count=0? Would this be portable? What if some other MPI implementation on some platform decides to complain because the fake value I'm passing does not represent a valid address?

PS: Maintaining an MPI-2 binding for Python requires a lot of care and attention to little details. I have to support Python >= 2.3 and the new Python 3; on Windows, Linux and OS X; with many of the MPI-1 and MPI-2 implementations out there... Consistent behavior and standard compliance in MPI implementations is FUNDAMENTAL to developing portable wrappers for other languages... Unfortunately, things are not so easy; mpi4py's source code and test suite are full of "if OMPI" and "if MPICH2"...

--
Lisandro Dalcin
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
[OMPI devel] MPI_Win_get_errhandler() and MPI_Win_set_errhandler() do not fail when passing MPI_WIN_NULL
I've reported this long ago (alongside other issues now fixed)... I can see that this is fixed in trunk and branches/v1.5, but not backported to branches/v1.4 Any chance to get this for 1.4.2? Or should it wait until 1.5? -- Lisandro Dalcin --- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594
Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL
Jeff Squyres wrote:
> There's no synchronization *guarantee* in MPI collectives except for MPI_BARRIER.
> [...] BCAST *can* synchronize; I'm not saying it has to.

I fully agree with Jeff and would even go a step further. As has already been noted, there are also some implicit data dependencies due to the fact that we do "message passing". This means that a receiver can only get a message after the sender has posted it. So yes, all processes get their broadcast message only after the root called MPI_Bcast and the like. But does this necessarily imply that all processes block in such a call and return only after the senders joined the communication? In my opinion, no correct and portable MPI program should rely on anything that is not explicitly stated in the standard.

Example to think about: I developed an MPI wrapper several years ago (for a slow interconnect), which almost immediately returned from blocking MPI calls. Instead of wasting time waiting for the senders, it utilized features of the virtual memory subsystem to protect the given message buffers from not-yet-allowed accesses (i.e., write access for send buffers and read access for receive buffers), and started the communication in the background like the nonblocking variants. The blocking (if at all) happened only at the time the data was actually accessed by the processor (so the implicit synchronization point we are talking about was just delayed). This enabled communication and computation overlap without rewriting the application (even for send operations of large messages, due to pipelining) - just relink and see if it gets faster.

I'm not totally sure that this is 100% MPI conformant - but as long as programmers don't rely on anything that is not explicitly stated in the standard, they could benefit from such implementations...
[OMPI devel] RFC: Rename ompi/include/mpi_portable_platform.h
WHAT: Rename ompi/include/mpi_portable_platform.h to be opal/include/opal_portable_platform.h

WHY: The file includes definitions and macros that identify the compiler used to build the system, etc. The contents actually have nothing specific to do with MPI.

WHEN: Weekend of Feb 20th

I'm trying to rationalize the ompi_info system so that people who build different layers can still get a report of the MCA params, build configuration, etc. for the layers they build. Thus, there would be an "orte_info" and "opal_info" capability. Each would report not only the info for their own layer, but the layer(s) below. So ompi_info remains unchanged, orte_info reports ORTE and OPAL info, etc.

The problem I encountered is that the referenced file is required for the various "info" tools, but it exists in the MPI layer. Since the file is only accessed at build time, I can go ahead and reference it from within "orte_info" and "opal_info", but it does somewhat break the abstraction barrier to do so.

Given that the info in the file has nothing to do with MPI itself, it seemed reasonable to move it to opal... barring arguments to the contrary.

Ralph
Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL
This is absolutely not true. Open MPI supports zero-length collective operations (all of them, actually), but only if their arguments are correctly shaped. What you're asking for is a free ticket to write MPI calls that do not follow the MPI requirements when a special value for count is given.

While zero-length arrays/sequences/containers do appear in real code, they are not equal to NULL. If they are NULL, that means they do not contain any useful data, and they don't need to be the source or target of any kind of [collective or point-to-point] communication.

  george.

On Feb 11, 2010, at 11:53 , Lisandro Dalcin wrote:

> Well, I have to disagree... I understand you (as an MPI implementor)
> think that Reduce(count=0) could be meaningless and add complexity to
> the implementation of MPI_Reduce()... But Reduce(count=0) could save
> user code of special-casing the count==0 situation... after all,
> zero-length arrays/sequences/containers do appear in actual codes...
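[To illustrate the "correctly shaped" point, a minimal sketch; the communicator, root, and MPI_SUM op are placeholders chosen purely for illustration. A zero-count reduce with distinct, non-NULL send and receive buffers passes Open MPI's argument checks, whereas passing the same (or NULL) pointer for both is what triggers the failure discussed in this thread.]

#include <mpi.h>

int main(int argc, char *argv[])
{
    int sbuf[1], rbuf[1];   /* distinct, valid addresses */
    MPI_Init(&argc, &argv);

    /* count == 0: no reduction is actually performed, but the arguments
       are still "correctly shaped" -- sendbuf and recvbuf differ. */
    MPI_Reduce(sbuf, rbuf, 0, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    /* By contrast, MPI_Reduce(NULL, NULL, 0, ...) trips the
       sendbuf != recvbuf check, which is the failure mpi4py ran into. */

    MPI_Finalize();
    return 0;
}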
[OMPI devel] RFC: Processor affinity hardware thread support
WHAT: Add hardware thread support to processor affinity components and new options to orterun.

WHY: OMPI currently does not correctly recognize processors that support hardware threads. In cases where the user uses the mpirun options -bind-to-* and -by-*, processes are bound to the first thread on each core. In cases where the user specifies cores to bind to in the rankfile, those numbers are interpreted as thread ids as opposed to core ids. These ill side effects can lead to confusion as to which resources processes in a job are bound to, and in the worst case a user could end up unknowingly oversubscribing resources.

WHERE: orte/mca/rmaps, orte/mca/odls, orte/util/hostfile, orte/tools/orterun/orterun.c, opal/mca/paffinity

WHEN: 03/15/10

TIMEOUT: 02/24/10

The current OMPI paffinity implementation uses PLPA to set bindings of processes to cores or sockets. On systems that support hardware threads, however, PLPA treats a hardware thread as a core and in certain cases may not be able to completely map all hardware threads. This happens because the paffinity framework does not recognize hardware threads. I propose support such that hardware thread resources can be identified and have processes bound to them. (Note: we plan on creating a new paffinity component using the hwloc API as opposed to extending the PLPA component.)

Once the paffinity framework supports hardware threads, I would like to propose the following defaults and new options. I think we should first implement the "Defaults" section, put it back, and then start on the new options and rankfile/hostfile fields.

Defaults: In the case of no process binding we maintain the current rule of not doing anything. When -bind-to-core or a core binding defined in the rankfile is used, the MPI process will be bound to all hardware threads on a core (the OS will manage the scheduling of processes between hardware threads). This is similar to how OMPI handles scheduling of processes on cores when the -bind-to-socket option is specified to mpirun.

New Options to mpirun:
1. -bind-to-thread - Bind processes to hardware threads, analogous to -bind-to-core and -bind-to-socket.
2. -threads-per-proc - Use the given number of threads per process if used with one of the -bind-to-* options.
3. -bythread - Associate processes with successive hardware threads if used with one of the -bind-to-* options.
4. -num-threads - Specify the number of hardware threads per core (for cases where Open MPI doesn't already know this information).

New Fields to Rankfiles: We'll be adding a third field to the slot specification of the rankfile. For a rankfile entry that has 3 fields specified for a slot, the last field is the hardware thread id. Otherwise it is assumed that hardware thread scheduling is left up to the OS.

rank 0=aa slot=1:0:0-3
rank 1=bb slot=0:0
rank 2=cc slot=1-2

So in the case of rank 0, the process is bound to socket 1, core 0, and hardware threads 0-3. In the case of rank 1, it is bound to socket 0, core 0, and hardware thread scheduling is left to the OS. In the case of rank 2, it is bound to cores 1 and 2, and hardware thread scheduling is left to the OS.
Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL
On 11 February 2010 15:06, George Bosilca wrote:
> This is absolutely not true. Open MPI supports zero-length collective
> operations (all of them, actually), but only if their arguments are
> correctly shaped.

OK, you are right here ...

> What you're asking for is a free ticket to write MPI calls that do not follow
> the MPI requirements when a special value for count is given.

But you did not answer my previous question... What's the rationale for requiring sendbuf!=recvbuf when count=0? I would argue you want a free ticket :-) to put restrictions on user code (without an actual rationale) in order to simplify your implementation.

> While zero-length arrays/sequences/containers do appear in real code, they
> are not equal to NULL. If they are NULL, that means they do not contain any
> useful data, and they don't need to be the source or target of any kind of
> [collective or point-to-point] communication.

Yes, I know. Moreover, I agree with you. NULL should be reserved for invalid pointers, not for zero-length arrays... The problem is that people out there seem to disagree or just do not pay any attention to this, so (pointer=NULL, length=0) DOES APPEAR in real life (like the Python example I previously showed you)... Additionally, some time ago (while discussing MPI_Alloc_mem(size=0)) we commented on the different return values of malloc(0) depending on the platform...

Well, this discussion has gone far enough... In the end, I agree that representing zero-length arrays with (pointer=NULL, length=0) should be regarded as bad practice...

--
Lisandro Dalcin
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
Re: [OMPI devel] MPI_Win_get_errhandler() and MPI_Win_set_errhandler() do not fail when passing MPI_WIN_NULL
Yes -- it should be in 1.4.2 -- the CMR George filed after your mail earlier today includes both the REQUEST_NULL and WIN_*_ERRHANDLER stuff:

    https://svn.open-mpi.org/trac/ompi/ticket/2257

I just added you to the CC.

BUT: I think we should be careful with r20537; if we bring that over (and I think we should), we should *also* bring over r20616 because of a change I instigated in MPI-2.2 due to exactly this issue.

All of this stuff is already in the v1.5 branch. Thanks for keeping after us!

On Feb 11, 2010, at 12:06 PM, Lisandro Dalcin wrote:

> I've reported this long ago (alongside other issues now fixed)...
>
> I can see that this is fixed in trunk and branches/v1.5, but not
> backported to branches/v1.4
>
> Any chance to get this for 1.4.2? Or should it wait until 1.5?
>
> --
> Lisandro Dalcin
> ---
> Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
> Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
> Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
> PTLC - Güemes 3450, (3000) Santa Fe, Argentina
> Tel/Fax: +54-(0)342-451.1594

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] failure with zero-length Reduce() and both sbuf=rbuf=NULL
On Feb 11, 2010, at 2:14 PM, Lisandro Dalcin wrote:
> But you did not answer my previous question... What's the rationale
> for requiring sendbuf!=recvbuf when count=0? I would argue you want a
> free ticket :-) to put restrictions on user code (without an actual
> rationale) in order to simplify your implementation.

I don't understand your assertion. The MPI spec clearly says that sendbuf must != recvbuf. If you want the sendbuf to be the same as the recvbuf, MPI supports MPI_IN_PLACE for several operations. I realize that's not what you're trying to do, but these are the semantics that MPI has defined.

> > While zero-length arrays/sequences/containers do appear in real code, they
> > are not equal to NULL. If they are NULL, that means they do not contain any
> > useful data, and they don't need to be the source or target of any kind of
> > [collective or point-to-point] communication.

And even stronger than this: remember that NULL *is* a valid pointer for MPI when it is paired with an appropriate datatype. As I said in an earlier mail, NULL is therefore not a special-case buffer for sendbuf or recvbuf. To be absolutely clear: none of OMPI's MPI API calls have checks of the form:

    if (NULL == choice_buffer) return error;

> Yes, I know. Moreover, I agree with you. NULL should be reserved for
> invalid pointers, not for zero-length arrays...

But it is not. MPI's datatype mechanism is so general that NULL is valid. So yes, passing MPI_REDUCE(NULL, NULL, ...) violates the sendbuf!=recvbuf rule (partially because there is only one datatype in MPI_REDUCE). If a language may convert a buffer representation to NULL for you behind the scenes, then it's up to the language binding to catch/correct that. ...at least by the wording in today's MPI spec.

That being said, your Python example of buffers a and b unknowingly being transformed to NULL behind the scenes seems like a good thing that MPI should support better. It's exactly these kinds of issues that would be helpful to know / discuss / propose improvements for MPI-3. Could we convince you to come to a Forum meeting? :-)

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
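[Since MPI_IN_PLACE comes up here as the sanctioned way to reuse a buffer, a minimal sketch; the communicator, count, and MPI_SUM op are placeholders for illustration only.]

#include <mpi.h>

int main(int argc, char *argv[])
{
    int buf[4] = {1, 2, 3, 4};
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* At the root, MPI_IN_PLACE means "the contribution is already in
           the receive buffer"; this is the standard-sanctioned way to let
           the send and receive data share storage. */
        MPI_Reduce(MPI_IN_PLACE, buf, 4, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    } else {
        /* recvbuf is only significant at the root. */
        MPI_Reduce(buf, NULL, 4, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}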
Re: [OMPI devel] RFC: Rename ompi/include/mpi_portable_platform.h
My only $0.02 is that if we rename it to opal_portable_platform.h, we must remember that this file is #included in mpi.h, and therefore it is installed in user OMPI installations.

$includedir/mpi_portable_platform.h was deemed to be a "safe" filename. But we've already had a name conflict with "opal" -- so I'm not sure that $includedir/opal_portable_platform.h is a "safe" filename. We might consider installing it in $includedir/openmpi/opal_portable_platform.h... or something like that (I realize that $includedir/openmpi/ is not necessarily a good choice for OPAL and ORTE standalone projects; so perhaps a little creativity is required here...).

On Feb 11, 2010, at 12:33 PM, Ralph Castain wrote:

> WHAT: Rename ompi/include/mpi_portable_platform.h to be
> opal/include/opal_portable_platform.h
>
> WHY: The file includes definitions and macros that identify the compiler
> used to build the system, etc. The contents actually have nothing specific
> to do with MPI.
>
> WHEN: Weekend of Feb 20th
>
> I'm trying to rationalize the ompi_info system so that people who build
> different layers can still get a report of the MCA params, build
> configuration, etc. for the layers they build. Thus, there would be an
> "orte_info" and "opal_info" capability. Each would report not only the info
> for their own layer, but the layer(s) below. So ompi_info remains unchanged,
> orte_info reports ORTE and OPAL info, etc.
>
> The problem I encountered is that the referenced file is required for the
> various "info" tools, but it exists in the MPI layer. Since the file is only
> accessed at build time, I can go ahead and reference it from within
> "orte_info" and "opal_info", but it does somewhat break the abstraction
> barrier to do so.
>
> Given that the info in the file has nothing to do with MPI itself, it seemed
> reasonable to move it to opal... barring arguments to the contrary.
>
> Ralph

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] RFC: Rename ompi/include/mpi_portable_platform.h
I wouldn't change the installation location - just thought it would be good to avoid the abstraction break in the source code. Remember - this file doesn't get installed at all unless we built the MPI layer... On Feb 11, 2010, at 1:11 PM, Jeff Squyres wrote: > My only $0.02 is that if we rename it to opal_portable_platform.h, we must > remember that this file is #included in mpi.h, and therefore it is installed > in user OMPI installations. > > $includedir/mpi_portable_platform.h was deemed to be a "safe" filename. But > we've already had a name conflict with "opal" -- so I'm not sure that > $includedir/opal_portable_platform.h is a "safe" filename. We might consider > installing it in $includedir/openmpi/opal_portable_platform.h... or something > like that (I realize that $includeddir/openmpi/ is not necessarily a > good choice for OPAL and ORTE standalone projects; so perhaps a little > creativity is required here...). > > > > On Feb 11, 2010, at 12:33 PM, Ralph Castain wrote: > >> WHAT: Rename ompi/include/mpi_portable_platform.h to be >> opal/include/opal_portable_platform.h >> >> WHY: The file includes definitions and macros that identify the compiler >> used to build the system, etc. >> The contents actually have nothing specific to do with MPI. >> >> WHEN:Weekend of Feb 20th >> >> >> >> I'm trying to rationalize the ompi_info system so that people who build >> different layers can still get a report of the MCA params, build >> configuration, etc. for the layers they build. Thus, there would be an >> "orte_info" and "opal_info" capability. Each would report not only the info >> for their own layer, but the layer(s) below. So ompi_info remains unchanged, >> orte_info reports ORTE and OPAL info, etc. >> >> The problem I encountered is that the referenced file is required for the >> various "info" tools, but it exists in the MPI layer. Since the file is only >> accessed at build time, I can go ahead and reference it from within >> "orte_info" and "opal_info", but it does somewhat break the abstraction >> barrier to do so. >> >> Given that the info in the file has nothing to do with MPI itself, it seemed >> reasonable to move it to opal...barring arguments to the contrary. >> >> Ralph >> >> >> ___ >> devel mailing list >> de...@open-mpi.org >> http://www.open-mpi.org/mailman/listinfo.cgi/devel >> > > > -- > Jeff Squyres > jsquy...@cisco.com > > For corporate legal information go to: > http://www.cisco.com/web/about/doing_business/legal/cri/ > > > ___ > devel mailing list > de...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/devel
[OMPI devel] documenting the PMPI profiling interface
In the MPI standard, the portion discussing the PMPI profiling interface says:

  3. document the implementation of different language bindings of the
     MPI interface if they are layered on top of each other, so that the
     profiler developer knows whether she must implement the profile
     interface for each binding, or can economise by implementing it
     only for the lowest level routines.

  http://www.mpi-forum.org/docs/mpi22-report/node313.htm#Node313

Do we have such documentation anywhere? I don't see this in the OMPI FAQ.

I played with this some. I wrote a Fortran program that called MPI_Send. I wrote a Fortran wrapper that intercepted MPI_Send and called PMPI_Send. I wrote a C wrapper that did the same thing. It appears that both wrappers got called. So, it looks like we should advise users to provide *only* C wrappers (unless they *also* want to intercept at the Fortran level).

Yes/no?
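[For reference, a minimal sketch of the kind of C profiling wrapper described above; the printf is purely illustrative. It intercepts MPI_Send and forwards to PMPI_Send, and -- since Open MPI's Fortran bindings mostly forward to the C bindings, as discussed below -- it is typically reached from Fortran callers as well. Compile it into the application or into a library linked ahead of the MPI library.]

#include <stdio.h>
#include <mpi.h>

/* Profiling wrapper: the application's (or Fortran binding's) call to
   MPI_Send lands here; the real work is done by PMPI_Send. */
int MPI_Send(void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm)
{
    printf("MPI_Send intercepted: count=%d dest=%d tag=%d\n",
           count, dest, tag);
    return PMPI_Send(buf, count, datatype, dest, tag, comm);
}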
Re: [OMPI devel] RFC: Rename ompi/include/mpi_portable_platform.h
On Feb 11, 2010, at 3:57 PM, Ralph Castain wrote: > I wouldn't change the installation location - just thought it would be good > to avoid the abstraction break in the source code. > > Remember - this file doesn't get installed at all unless we built the MPI > layer... Hmm. That becomes an interesting abstraction break in itself -- a Makefile.am in opal has to know if we're building / installing the MPI layer... -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] documenting the PMPI profiling interface
On Feb 11, 2010, at 4:13 PM, Eugene Loh wrote:
> In the MPI standard, the portion discussing the PMPI profiling interface says:
>
> 3. document the implementation of different language bindings of the MPI
> interface if they are layered on top of each other, so that the profiler
> developer knows whether she must implement the profile interface for each
> binding, or can economise by implementing it only for the lowest level
> routines.
>
> http://www.mpi-forum.org/docs/mpi22-report/node313.htm#Node313
>
> Do we have such documentation anywhere? I don't see this in the OMPI FAQ.
>
> I played with this some. I wrote a Fortran program that called MPI_Send. I
> wrote a Fortran wrapper that intercepted MPI_Send and called PMPI_Send. I
> wrote a C wrapper that did the same thing. It appears that both wrappers got
> called. So, it looks like we should advise users to provide *only* C
> wrappers (unless they *also* want to intercept at the Fortran level).
>
> Yes/no?

Yes. Mostly.

I believe there are a small number of exceptions to this... (/me checks...) Ah yes, here's one: MPI_ERRHANDLER_CREATE() in Fortran does *not* call MPI_Errhandler_create(). Instead, it calls the back-end ompi_errhandler_create() function. There are obscure reasons for this that are pretty uninteresting. To be clear: if you profile this function in both C and Fortran and call it in Fortran, you *won't* see the corresponding C profile function invoked.

I don't know if there's an easy way to generate a full list of functions like this -- it might involve a troll through ompi/mpi/f77/*_f.c to see which ones call MPI_* functions for their back-end functionality vs. which ones don't. I think most call MPI_* functions.

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
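[To make the layering concrete, a hypothetical, simplified sketch of a Fortran binding shim that forwards to the C binding; the symbol name and argument conventions are illustrative only, not Open MPI's actual source. Because such a shim calls MPI_Send rather than PMPI_Send, a single C profiling wrapper also catches Fortran callers; the few bindings that call internal ompi_* functions instead (like the Fortran MPI_ERRHANDLER_CREATE above) are the exceptions.]

#include <mpi.h>

/* Hypothetical, simplified Fortran-to-C binding shim (illustrative only). */
void mpi_send_(void *buf, MPI_Fint *count, MPI_Fint *datatype,
               MPI_Fint *dest, MPI_Fint *tag, MPI_Fint *comm, MPI_Fint *ierr)
{
    /* Convert the Fortran integer handles to C handles, then call the
       C binding -- which is where a PMPI wrapper would intercept. */
    *ierr = (MPI_Fint) MPI_Send(buf, (int) *count,
                                MPI_Type_f2c(*datatype),
                                (int) *dest, (int) *tag,
                                MPI_Comm_f2c(*comm));
}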
Re: [OMPI devel] RFC: Adding -DOPEN_MPI=1 to mpif77 and mpif90
Jeff Squyres wrote: [about -D not working with xlf] If we care, it is easy enough to add a configure test to figure this kind of stuff out. Might be worth logging a bug with the autotools/autoconf people on this (if it's not already there), it's been mentioned recently on their lists as something they should look at doing better: http://old.nabble.com/Re:-Can't-use-Fortran-90-95-compiler-for-F77-p26209677.html cheers, Chris -- Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
Re: [OMPI devel] documenting the PMPI profiling interface
Jeff Squyres wrote: On Feb 11, 2010, at 4:13 PM, Eugene Loh wrote: In the MPI standard, the portion discussing the PMPI profiling interface says: 3. document the implementation of different language bindings of the MPI interface if they are layered on top of each other, so that the profiler developer knows whether she must implement the profile interface for each binding, or can economise by implementing it only for the lowest level routines. http://www.mpi-forum.org/docs/mpi22-report/node313.htm#Node313 Do we have such documentation anywhere? I don't see this in the OMPI FAQ. I played with this some. I wrote a Fortran program that called MPI_Send. I wrote a Fortran wrapper that intercepted MPI_Send and called PMPI_Send. I wrote a C wrapper that did the same thing. It appears that both wrappers got called. So, it looks like we should advise users to provide *only* C wrappers (unless they *also* want to intercept at the Fortran level). Yes/no? Yes. Mostly. I believe there are a small number of exceptions to this... (/me checks...) Ah yes, here's one: MPI_ERRHANDLER_CREATE() in Fortran does *not* call MPI_Errhandler_create(). Instead, it calls the back-end ompi_errhandler_create() function. There's obscure reasons for this that are pretty uninteresting. To be clear: if you profile this function in both C and Fortran and call it in Fortran, you *won't* see the corresponding C profile function invoked. I don't know if there's an easy way to generate a full list of functions like this -- it might involve a troll through ompi/mpi/f77/*_f.c to see which ones call MPI_* functions for their back-end functionality vs. which ones don't. I think most call MPI_* functions. And I can imagine there are cases where you'd rather write the wrapper in the native language (e.g., Fortran) than C if handles are handled differently or something. Back to the opening question: is this documented anywhere? (Such documentation *is* a requirement of the standard and OMPI is standard conforming, y'know.)
Re: [OMPI devel] RFC: Adding -DOPEN_MPI=1 to mpif77 and mpif90
Mm... good to know. Thanks! On Feb 11, 2010, at 5:58 PM, Chris Samuel wrote: > Jeff Squyres wrote: > > [about -D not working with xlf] > > > If we care, it is easy enough to add a configure test to > > figure this kind of stuff out. > > Might be worth logging a bug with the autotools/autoconf > people on this (if it's not already there), it's been > mentioned recently on their lists as something they > should look at doing better: > > http://old.nabble.com/Re:-Can't-use-Fortran-90-95-compiler-for-F77-p26209677.html > > cheers, > Chris > -- > Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC > ___ > devel mailing list > de...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/devel > -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] documenting the PMPI profiling interface
On Feb 11, 2010, at 6:08 PM, Eugene Loh wrote: > Back to the opening question: is this documented anywhere? (Such > documentation *is* a requirement of the standard and OMPI is standard > conforming, y'know.) The code? ;-) No, I don't believe we document this stuff anywhere. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] RFC: Rename ompi/include/mpi_portable_platform.h
Hi Ralph,

hmm, I don't really care about the name itself. As Jeff mentioned, we'd have an "abstraction break" either way.

The question I have is: why does orte_info need to include the information about which compiler it was compiled with ;-)? We basically only care to warn users about a typical MPI-user compilation mismatch (C gcc + Fortran pgf77).

So, I would rather keep it as is...

Regards,
Rainer

On Thursday 11 February 2010 12:33:28 pm Ralph Castain wrote:
> WHAT: Rename ompi/include/mpi_portable_platform.h to be
> opal/include/opal_portable_platform.h
>
> WHY: The file includes definitions and macros that identify the compiler
> used to build the system, etc. The contents actually have nothing specific
> to do with MPI.
>
> WHEN: Weekend of Feb 20th
>
> I'm trying to rationalize the ompi_info system so that people who build
> different layers can still get a report of the MCA params, build
> configuration, etc. for the layers they build. Thus, there would be an
> "orte_info" and "opal_info" capability. Each would report not only the
> info for their own layer, but the layer(s) below. So ompi_info remains
> unchanged, orte_info reports ORTE and OPAL info, etc.
>
> The problem I encountered is that the referenced file is required for the
> various "info" tools, but it exists in the MPI layer. Since the file is
> only accessed at build time, I can go ahead and reference it from within
> "orte_info" and "opal_info", but it does somewhat break the abstraction
> barrier to do so.
>
> Given that the info in the file has nothing to do with MPI itself, it
> seemed reasonable to move it to opal... barring arguments to the contrary.
>
> Ralph

--
Rainer Keller, PhD      Tel: +1 (865) 241-6293
Oak Ridge National Lab  Fax: +1 (865) 241-4811
PO Box 2008 MS 6164     Email: kel...@ornl.gov
Oak Ridge, TN 37831-2008    AIM/Skype: rusraink
Re: [OMPI devel] documenting the PMPI profiling interface
Jeff Squyres wrote: On Feb 11, 2010, at 6:08 PM, Eugene Loh wrote: Back to the opening question: is this documented anywhere? (Such documentation *is* a requirement of the standard and OMPI is standard conforming, y'know.) The code? ;-) No, I don't believe we document this stuff anywhere. You lie like a dog. Look again! Er, best to wait for the FAQ refresh tonight. And, careful about brushing up against the fresh paint.
Re: [OMPI devel] RFC: Rename ompi/include/mpi_portable_platform.h
On Feb 11, 2010, at 4:45 PM, Rainer Keller wrote: > Hi Ralph, > hmm, I don't really care about the name itselve. > As Jeff mentioned, we'd have a "abstraction break" either way. There is no abstraction break - I talked to Jeff about it and cleared up the confusion. The OMPI code will have an install line that installs the opal file in a to-be-determined place where mpi.h can include it. No "mpi" references required in OPAL. > > The question I have, why does orte_info need to include the information, > which > compiler it was compiled with ;-)? Because people create cross-compiled versions, use module files to define which one they are using, etc. > > We basically only care to warn users about a typical MPI-user compilation > mismatch (C gcc+ Fortran pgf77). Not quite correct - you need to know that it was built to cross-compile vs native. HTH Ralph > > So, I would rather keep it as is... > > > Regards, > Rainer > > > > On Thursday 11 February 2010 12:33:28 pm Ralph Castain wrote: >> WHAT: Rename ompi/include/mpi_portable_platform.h to be >> opal/include/opal_portable_platform.h >> >> WHY: The file includes definitions and macros that identify the compiler >> used to build the system, etc. The contents actually have nothing specific >> to do with MPI. >> >> WHEN:Weekend of Feb 20th >> >> >> >> I'm trying to rationalize the ompi_info system so that people who build >> different layers can still get a report of the MCA params, build >> configuration, etc. for the layers they build. Thus, there would be an >> "orte_info" and "opal_info" capability. Each would report not only the >> info for their own layer, but the layer(s) below. So ompi_info remains >> unchanged, orte_info reports ORTE and OPAL info, etc. >> >> The problem I encountered is that the referenced file is required for the >> various "info" tools, but it exists in the MPI layer. Since the file is >> only accessed at build time, I can go ahead and reference it from within >> "orte_info" and "opal_info", but it does somewhat break the abstraction >> barrier to do so. >> >> Given that the info in the file has nothing to do with MPI itself, it >> seemed reasonable to move it to opal...barring arguments to the contrary. >> >> Ralph >> >> >> ___ >> devel mailing list >> de...@open-mpi.org >> http://www.open-mpi.org/mailman/listinfo.cgi/devel >> > > -- > > Rainer Keller, PhD Tel: +1 (865) 241-6293 > Oak Ridge National Lab Fax: +1 (865) 241-4811 > PO Box 2008 MS 6164 Email: kel...@ornl.gov > Oak Ridge, TN 37831-2008AIM/Skype: rusraink >