Re: [OMPI devel] mosix patches
It was never committed to OMPI - it became stuck in a side branch when the developer graduated and took a job, and never came across. Given the age, I'd suspect that side branch is way out-of-date and would probably need some significant effort before it could be brought into the OMPI trunk, assuming someone took the effort to do so. On Apr 24, 2014, at 9:07 AM, Pavel V. Kaygorodov wrote: > Hi! > > What is the current status of mosix support in OpenMPI? > I have tried patches from > http://www.cs.huji.ac.il/wikis/MediaWiki/mosix/index.php/Process_migration_for_OpenMPI > but without any success, even on the 1.6 branch. > > With best regards, > Pavel. > > ___ > devel mailing list > de...@open-mpi.org > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel > Link to this post: > http://www.open-mpi.org/community/lists/devel/2014/04/14593.php
Re: [OMPI devel] RFC: Remove heterogeneous support
On Apr 24, 2014, at 12:05 PM, Andreas Schäfer wrote: > Hey, > > On 14:49 Thu 24 Apr , George Bosilca wrote: >> On Thu, Apr 24, 2014 at 1:06 PM, Jeff Squyres (jsquyres) >> wrote: >>> The code is unused. It has been unused for a long time. It is >> unlikely to be fixed. > > We'd be using it, probably not in production, but in research and > teaching -- if it was operational. > > And, as George pointed out, I see a trend towards heterogeneity in > HPC, so I'd say this feature will be rather more important in the > future. We have been hearing about such "trends" for a long time, but have yet to see them actually happen. Not saying it couldn't some day - just saying it still hasn't happened in production. > >> PS: This code has implications from the datatype engine up into the >> MPI layer. It also impacts the BTL, especially the hand-shake for the >> ones requiring such a protocol. It also has an impact on the external32 >> support in MPI, for some types of architectures. So its removal >> should be an extremely cautious and surgical operation. > > So, would repairing the code be significantly more complicated than a > clean extraction? Unless someone volunteers to fix it, it would seem the question is moot. My employer isn't interested, and I'm not sure any of the employers within the OMPI community are currently inclined to support such an effort. I can't speak to what George is referring to re how it was broken, as I honestly don't recall the circumstances. We know it has been broken for some time, and that nobody really has a setup to test it - we can check that it compiles, but I don't think any of us actually have a hetero cluster upon which we could test it. And as my production code friends keep pointing out - if you can't test it, then you can't "sell" it. So here's what I suggest: if someone is willing to take the lead in fixing hetero operations, and has the hardware upon which to verify it, then please step forward.
Otherwise, I agree with Jeff that we should remove it and move on. > > Cheers > -Andreas > > > -- > == > Andreas Schäfer > HPC and Grid Computing > Chair of Computer Science 3 > Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany > +49 9131 85-27910 > PGP/GPG key via keyserver > http://www.libgeodecomp.org > == > > (\___/) > (+'.'+) > (")_(") > This is Bunny. Copy and paste Bunny into your > signature to help him gain world domination!
Re: [OMPI devel] RFC: Well-known mca parameters
Just for clarification: are you proposing that we *require* every component that links against an external library to include these parameters? If so, that seems a significant requirement, as quite a few of our components do so. On the other hand, if you are proposing that those component developers who choose to expose such information do so using the suggested syntax, then that is a different proposal. Just want to understand what you are proposing - a requirement on components, or a syntax for those who choose to support this capability? FWIW: we do not (and cannot, for licensing reasons) link against Slurm, so please don't include it in such lists to avoid giving anyone even a passing thought that we do so. On Apr 23, 2014, at 10:38 PM, Mike Dubman wrote: > > WHAT: > * Formalize well-known MCA parameters that can be used by any component to > represent external dependencies for this component. > > * Components can set these well-known r/o MCA parameters to expose to > end-users different setup-related traits of the OMPI installation. > > Example: > > ompi_info can print for every component which depends on an external library: > - ext lib runtime version used by component > - ext lib compiletime version used by component > > slurm: v2.6.6 > mtl/mxm: v2.5 > btl/verbs: v3.2 > btl/usnic: v1.1 > coll/fca: v2.5 > ... > > End-user or site admin or OMPI vendor can aggregate this information by some > script and generate a report on whether a given installation complies with site/vendor > rules. > > * The "well-known" MCA parameters can be easily extracted from ALL components > by grep-like utilities.
> > * Current proposal: > > ** prefix each well-known MCA param with "print_" > ** Define two well-known MCA parameters indicating external library runtime > and compiletime versions, i.e.: > > print_compiletime_version > print_runtime_version > > The following command will show all exposed well-known MCA params from all > components: > ompi_info --parsable -l 9 | grep ":print_" > > > WHY: > > * Better supportability: site/vendor can provide a script which will check if an > OMPI installation complies with release notes or a support matrix. > > > WHEN: > > - Next teleconf > - code can be observed here: https://svn.open-mpi.org/trac/ompi/ticket/4556 > > > Comments?
Re: [OMPI devel] MPI_Recv_init_null_c from intel test suite fails vs ompi trunk
The problem was not in the start but in the wait (hint: the status is set in the wait). The difference, I guess, is r27880, which seems not to be in the 1.8. So the 1.8 is not returning the correct status for persistent inactive requests, but it does the right thing for MPI_PROC_NULL bound requests. George. On Thu, Apr 24, 2014 at 6:19 PM, Jeff Squyres (jsquyres) wrote: > George -- > > Any idea why it isn't failing on the v1.8 branch? The only major difference > I see between mpi/c/start.c between trunk and v1.8 is your change. > > > > On Apr 24, 2014, at 2:08 PM, George Bosilca wrote: > >> r31524 is fixing this corner case. The problem was that persistent >> requests with MPI_PROC_NULL were never activated, so the wait* function >> was taking the branch corresponding to inactive requests. >> >> George. >> >> On Thu, Apr 24, 2014 at 12:14 AM, Gilles Gouaillardet >> wrote: >>> Folks, >>> >>> Here is attached an oversimplified version of the MPI_Recv_init_null_c >>> test from the >>> intel test suite. >>> >>> The test works fine with v1.6, v1.7 and v1.8 branches but fails with the >>> trunk. >>> >>> I wonder whether the bug is in OpenMPI or the test itself. >>> >>> On one hand, we could consider there is a bug in OpenMPI: >>> status.MPI_SOURCE should be MPI_PROC_NULL since we explicitly posted a >>> recv request with MPI_PROC_NULL. >>> >>> On the other hand (MPI specs, chapter 3.7.3 and >>> https://svn.open-mpi.org/trac/ompi/ticket/3475), >>> we could consider the returned value is not significant, and hence >>> MPI_Wait should return an >>> empty status (an empty status has source=MPI_ANY_SOURCE per the MPI specs). >>> >>> For what it's worth, this test is a success with mpich (e.g. >>> status.MPI_SOURCE is MPI_PROC_NULL). >>> >>> >>> What is the correct interpretation of the MPI specs and what should be >>> done? >>> (e.g. fix OpenMPI or fix/skip the test?)
>>> >>> Cheers, >>> >>> Gilles > > -- > Jeff Squyres > jsquy...@cisco.com > For corporate legal information go to: > http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] MPI_Recv_init_null_c from intel test suite fails vs ompi trunk
George -- Any idea why it isn't failing on the v1.8 branch? The only major difference I see between mpi/c/start.c between trunk and v1.8 is your change. On Apr 24, 2014, at 2:08 PM, George Bosilca wrote: > r31524 is fixing this corner case. The problem was that persistent > requests with MPI_PROC_NULL were never activated, so the wait* function > was taking the branch corresponding to inactive requests. > > George. > > On Thu, Apr 24, 2014 at 12:14 AM, Gilles Gouaillardet > wrote: >> Folks, >> >> Here is attached an oversimplified version of the MPI_Recv_init_null_c >> test from the >> intel test suite. >> >> The test works fine with v1.6, v1.7 and v1.8 branches but fails with the >> trunk. >> >> I wonder whether the bug is in OpenMPI or the test itself. >> >> On one hand, we could consider there is a bug in OpenMPI: >> status.MPI_SOURCE should be MPI_PROC_NULL since we explicitly posted a >> recv request with MPI_PROC_NULL. >> >> On the other hand (MPI specs, chapter 3.7.3 and >> https://svn.open-mpi.org/trac/ompi/ticket/3475), >> we could consider the returned value is not significant, and hence >> MPI_Wait should return an >> empty status (an empty status has source=MPI_ANY_SOURCE per the MPI specs). >> >> For what it's worth, this test is a success with mpich (e.g. >> status.MPI_SOURCE is MPI_PROC_NULL). >> >> >> What is the correct interpretation of the MPI specs and what should be >> done? >> (e.g. fix OpenMPI or fix/skip the test?)
>> >> Cheers, >> >> Gilles -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] RFC: Remove heterogeneous support
Hey, On 14:49 Thu 24 Apr , George Bosilca wrote: > On Thu, Apr 24, 2014 at 1:06 PM, Jeff Squyres (jsquyres) > wrote: > > The code is unused. It has been unused for a long time. It is > unlikely to be fixed. We'd be using it, probably not in production, but in research and teaching -- if it was operational. And, as George pointed out, I see a trend towards heterogeneity in HPC, so I'd say this feature will be rather more important in the future. > PS: This code has implications from the datatype engine up into the > MPI layer. It also impacts the BTL, especially the hand-shake for the > ones requiring such a protocol. It also has an impact on the external32 > support in MPI, for some types of architectures. So its removal > should be an extremely cautious and surgical operation. So, would repairing the code be significantly more complicated than a clean extraction? Cheers -Andreas -- == Andreas Schäfer HPC and Grid Computing Chair of Computer Science 3 Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany +49 9131 85-27910 PGP/GPG key via keyserver http://www.libgeodecomp.org == (\___/) (+'.'+) (")_(") This is Bunny. Copy and paste Bunny into your signature to help him gain world domination!
Re: [OMPI devel] RFC: Remove heterogeneous support
On Thu, Apr 24, 2014 at 1:06 PM, Jeff Squyres (jsquyres) wrote: > On Apr 24, 2014, at 12:54 PM, George Bosilca wrote: > >> There seems to be an opportunity to still have heterogeneous environment in >> the future. >> http://www.enterprisetech.com/2014/04/23/ibm-google-show-power8-systems-openpower-efforts/ > > How so? As the link I sent highlights, there is a push, a reasonable effort, to bring another processor family into the mainstream. This opens the potential for the dawn of heterogeneous data centers, thus the need for at least some basic support for heterogeneous environments. > >> I don’t think it is fair to shift the burden on the original developer >> instead of the committer who broke a feature. > > I don't see how your comment is related to this RFC. Because I have the feeling the logic behind the RFC is: it is broken and must be removed because nobody wants to fix it. And I don't agree with this logic. This particular code was working and was used, but incompetence and carelessness (in any arbitrary order) broke it. > > The code is unused. It has been unused for a long time. It is unlikely to > be fixed. I wrote a significant portion of the code pinpointed in this RFC, and maintained it for a reasonable amount of time, despite a number of careless commits. But today, you are right, I have no intention of fixing it anymore, and I don't think anybody wants to volunteer for such a chore. George. PS: This code has implications from the datatype engine up into the MPI layer. It also impacts the BTL, especially the hand-shake for the ones requiring such a protocol. It also has an impact on the external32 support in MPI, for some types of architectures. So its removal should be an extremely cautious and surgical operation. > > Why not remove it?
> > -- > Jeff Squyres > jsquy...@cisco.com > For corporate legal information go to: > http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] MPI_Recv_init_null_c from intel test suite fails vs ompi trunk
r31524 is fixing this corner case. The problem was that persistent requests with MPI_PROC_NULL were never activated, so the wait* function was taking the branch corresponding to inactive requests. George. On Thu, Apr 24, 2014 at 12:14 AM, Gilles Gouaillardet wrote: > Folks, > > Here is attached an oversimplified version of the MPI_Recv_init_null_c > test from the > intel test suite. > > The test works fine with v1.6, v1.7 and v1.8 branches but fails with the > trunk. > > I wonder whether the bug is in OpenMPI or the test itself. > > On one hand, we could consider there is a bug in OpenMPI: > status.MPI_SOURCE should be MPI_PROC_NULL since we explicitly posted a > recv request with MPI_PROC_NULL. > > On the other hand (MPI specs, chapter 3.7.3 and > https://svn.open-mpi.org/trac/ompi/ticket/3475), > we could consider the returned value is not significant, and hence > MPI_Wait should return an > empty status (an empty status has source=MPI_ANY_SOURCE per the MPI specs). > > For what it's worth, this test is a success with mpich (e.g. > status.MPI_SOURCE is MPI_PROC_NULL). > > > What is the correct interpretation of the MPI specs and what should be > done? > (e.g. fix OpenMPI or fix/skip the test?) > > Cheers, > > Gilles
Re: [OMPI devel] RFC: Remove heterogeneous support
On Apr 24, 2014, at 12:54 PM, George Bosilca wrote: > There seems to be an opportunity to still have heterogeneous environment in > the future. > http://www.enterprisetech.com/2014/04/23/ibm-google-show-power8-systems-openpower-efforts/ How so? > I don’t think it is fair to shift the burden on the original developer > instead of the committer who broke a feature. I don't see how your comment is related to this RFC. The code is unused. It has been unused for a long time. It is unlikely to be fixed. Why not remove it? -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] RFC: Remove heterogeneous support
There seems to be an opportunity to still have heterogeneous environment in the future. http://www.enterprisetech.com/2014/04/23/ibm-google-show-power8-systems-openpower-efforts/ I don’t think it is fair to shift the burden on the original developer instead of the committer who broke a feature. George. On Apr 23, 2014, at 09:49 , Jeff Squyres (jsquyres) wrote: > WHAT: Remove data-heterogeneous support from Open MPI > > WHY: No one uses it (it's not the default), it's broken (probably has been > for a while) > > WHERE: Datatype engine, some configury, and a few other places > > TIMEOUT: Tuesday teleconf, 6 May 2014 (i.e., 2 weeks from now) > > MORE DETAIL: > > It recently came to my attention that we seem to have some bit rot in the > heterogeneous data representation support such that if you configure with > --enable-heterogeneous, even if you run on homogeneous machines, you can get > segv's with tcp,sm,self. > > The heterogeneous support has never been enabled by default. AFAIK, only > Cisco tests it regularly in its MTT. I'd be greatly surprised if many (any?) > users use it at all. > > So I have to ask myself: why do we keep this functionality around? It seems > like we should delete this code, simplify things a little, and move on. > > Comments? > > -- > Jeff Squyres > jsquy...@cisco.com > For corporate legal information go to: > http://www.cisco.com/web/about/doing_business/legal/cri/
[OMPI devel] mosix patches
Hi! What is the current status of mosix support in OpenMPI? I have tried patches from http://www.cs.huji.ac.il/wikis/MediaWiki/mosix/index.php/Process_migration_for_OpenMPI but without any success, even on the 1.6 branch. With best regards, Pavel.
Re: [OMPI devel] Bug report: non-blocking allreduce with user-defined operation gives segfault
Hi George, Having looked again, you're correct about the two 2buf reductions being wrong. For now, I've updated my patch of nbc.c to copy buf1 into buf3 and then do buf3 OP= buf2 (see below). Patching ompi_3buff_op_reduce to cope with user-defined operations is certainly possible, but I don't really understand the implications of doing that for the rest of the codebase (this is the first time I've looked at the internals of OpenMPI). Best, Rupert

if (ompi_op_is_intrinsic(opargs.op)) {
    /* This does buf3 = buf1 OP buf2 */
    ompi_3buff_op_reduce(opargs.op, buf1, buf2, buf3,
                         opargs.count, opargs.datatype);
} else {
    /* Copy buf1 -> buf3 (if necessary), then do buf3 OP= buf2.
     * If the output is the same as the first input, we don't need to copy.
     * This only applies to the second input if the operator commutes. */
    if (buf1 == buf3) {
        ompi_op_reduce(opargs.op, buf2, buf3, opargs.count, opargs.datatype);
    } else if (buf2 == buf3 && ompi_op_is_commute(opargs.op)) {
        ompi_op_reduce(opargs.op, buf1, buf3, opargs.count, opargs.datatype);
    } else {
        res = NBC_Copy(buf1, opargs.count, opargs.datatype,
                       buf3, opargs.count, opargs.datatype, handle->comm);
        if (res != NBC_OK) {
            printf("NBC_Copy() failed (code: %i)\n", res);
            ret = res;
            goto error;
        }
        ompi_op_reduce(opargs.op, buf2, buf3, opargs.count, opargs.datatype);
    }
}

> Rupert, > > You are right, the code of any non-blocking reduce is not built with > user-level op in mind. However, I'm not sure about your patch. One > reason is that ompi_3buff is doing target = source1 op source2 while > ompi_2buf is doing target op= source (notice the op=) > > Thus you can't replace ompi_3buff by 2 ompi_2buff because you > basically replace target = source1 op source2 by target op= source1 op > source2 > > Moreover, a much nicer solution would be to patch directly the > ompi_3buff_op_reduce function in op.h to fall back to a user-defined > function when necessary. > > George.
[OMPI devel] Patch to fix valgrind warning
Please review the attached patch,

==19533== Conditional jump or move depends on uninitialised value(s)
==19533==    at 0x140DAB78: component_select (osc_sm_component.c:352)
==19533==    by 0xD9BA0B2: ompi_osc_base_select (osc_base_init.c:73)
==19533==    by 0xD9314C1: ompi_win_allocate (win.c:182)
==19533==    by 0xD982C4E: PMPI_Win_allocate (pwin_allocate.c:79)
==19533==    by 0xD628887: __pyx_pw_6mpi4py_3MPI_3Win_11Allocate (mpi4py.MPI.c:109170)
==19533==    by 0x38442E0BD3: PyEval_EvalFrameEx (in /usr/lib64/libpython2.7.so.1.0)
==19533==    by 0x38442E21EC: PyEval_EvalCodeEx (in /usr/lib64/libpython2.7.so.1.0)
==19533==    by 0x38442E22F1: PyEval_EvalCode (in /usr/lib64/libpython2.7.so.1.0)
==19533==    by 0x38442F20DB: PyImport_ExecCodeModuleEx (in /usr/lib64/libpython2.7.so.1.0)
==19533==    by 0x38442F2357: ??? (in /usr/lib64/libpython2.7.so.1.0)
==19533==    by 0x38442F2FF0: ??? (in /usr/lib64/libpython2.7.so.1.0)
==19533==    by 0x38442F323C: ??? (in /usr/lib64/libpython2.7.so.1.0)
==19533==
==19533== Conditional jump or move depends on uninitialised value(s)
==19533==    at 0x140DAB78: component_select (osc_sm_component.c:352)
==19533==    by 0xD9BA0B2: ompi_osc_base_select (osc_base_init.c:73)
==19533==    by 0xD93174D: ompi_win_allocate_shared (win.c:213)
==19533==    by 0xD982FD0: PMPI_Win_allocate_shared (pwin_allocate_shared.c:80)
==19533==    by 0xD62C727: __pyx_pw_6mpi4py_3MPI_3Win_13Allocate_shared (mpi4py.MPI.c:109409)
==19533==    by 0x38442E0BD3: PyEval_EvalFrameEx (in /usr/lib64/libpython2.7.so.1.0)
==19533==    by 0x38442E21EC: PyEval_EvalCodeEx (in /usr/lib64/libpython2.7.so.1.0)
==19533==    by 0x38442E22F1: PyEval_EvalCode (in /usr/lib64/libpython2.7.so.1.0)
==19533==    by 0x38442F20DB: PyImport_ExecCodeModuleEx (in /usr/lib64/libpython2.7.so.1.0)
==19533==    by 0x38442F2357: ??? (in /usr/lib64/libpython2.7.so.1.0)
==19533==    by 0x38442F2FF0: ??? (in /usr/lib64/libpython2.7.so.1.0)
==19533==    by 0x38442F323C: ??? (in /usr/lib64/libpython2.7.so.1.0)

--
Lisandro Dalcin
---
CIMEC (UNL/CONICET)
Predio CONICET-Santa Fe
Colectora RN 168 Km 472, Paraje El Pozo
3000 Santa Fe, Argentina
Tel: +54-342-4511594 (ext 1016)
Tel/Fax: +54-342-4511169

diff -up ompi/mca/osc/sm/osc_sm_component.c.orig ompi/mca/osc/sm/osc_sm_component.c
--- ompi/mca/osc/sm/osc_sm_component.c.orig 2014-04-24 10:28:58.790702380 +0300
+++ ompi/mca/osc/sm/osc_sm_component.c 2014-04-24 10:30:15.138137733 +0300
@@ -341,7 +341,7 @@ component_select(struct ompi_win_t *win,
 #if HAVE_PTHREAD_CONDATTR_SETPSHARED && HAVE_PTHREAD_MUTEXATTR_SETPSHARED
     pthread_mutexattr_t mattr;
     pthread_condattr_t cattr;
-    bool blocking_fence;
+    bool blocking_fence = false;
     int flag;
 
     if (OMPI_SUCCESS != ompi_info_get_bool(info, "blocking_fence",
@@ -349,7 +349,7 @@ component_select(struct ompi_win_t *win,
         goto error;
     }
 
-    if (blocking_fence) {
+    if (flag && blocking_fence) {
         ret = pthread_mutexattr_init(&mattr);
         ret = pthread_mutexattr_setpshared(&mattr, PTHREAD_PROCESS_SHARED);
         if (ret != 0) {
[OMPI devel] RFC: Well-known mca parameters
WHAT:

* Formalize well-known MCA parameters that can be used by any component to represent external dependencies for this component.

* Components can set these well-known r/o MCA parameters to expose to end-users different setup-related traits of the OMPI installation.

Example:

ompi_info can print for every component which depends on an external library:
- ext lib runtime version used by component
- ext lib compiletime version used by component

slurm: v2.6.6
mtl/mxm: v2.5
btl/verbs: v3.2
btl/usnic: v1.1
coll/fca: v2.5
...

End-user or site admin or OMPI vendor can aggregate this information by some script and generate a report on whether a given installation complies with site/vendor rules.

* The "well-known" MCA parameters can be easily extracted from ALL components by grep-like utilities.

* Current proposal:

** prefix each well-known MCA param with "print_"
** Define two well-known MCA parameters indicating external library runtime and compiletime versions, i.e.:

print_compiletime_version
print_runtime_version

The following command will show all exposed well-known MCA params from all components:

ompi_info --parsable -l 9 | grep ":print_"

WHY:

* Better supportability: site/vendor can provide a script which will check if an OMPI installation complies with release notes or a support matrix.

WHEN:

- Next teleconf
- code can be observed here: https://svn.open-mpi.org/trac/ompi/ticket/4556

Comments?
[OMPI devel] MPI_Recv_init_null_c from intel test suite fails vs ompi trunk
Folks, Here is attached an oversimplified version of the MPI_Recv_init_null_c test from the intel test suite. The test works fine with v1.6, v1.7 and v1.8 branches but fails with the trunk. I wonder whether the bug is in OpenMPI or the test itself. On one hand, we could consider there is a bug in OpenMPI: status.MPI_SOURCE should be MPI_PROC_NULL since we explicitly posted a recv request with MPI_PROC_NULL. On the other hand (MPI specs, chapter 3.7.3 and https://svn.open-mpi.org/trac/ompi/ticket/3475), we could consider the returned value is not significant, and hence MPI_Wait should return an empty status (an empty status has source=MPI_ANY_SOURCE per the MPI specs). For what it's worth, this test is a success with mpich (e.g. status.MPI_SOURCE is MPI_PROC_NULL). What is the correct interpretation of the MPI specs and what should be done? (e.g. fix OpenMPI or fix/skip the test?) Cheers, Gilles

/*
 * This test program is an oversimplified version of the
 * MPI_Recv_init_null_c test from the intel test suite.
 *
 * It can be run on one task:
 *   mpirun -np 1 -host localhost ./a.out
 *
 * When run on the trunk, since r28431, the test will fail:
 * status.MPI_SOURCE is MPI_ANY_SOURCE instead of MPI_PROC_NULL
 *
 * Copyright (c) 2014 Research Organization for Information Science
 *                    and Technology (RIST). All rights reserved.
 */

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    MPI_Status status;
    MPI_Request req;
    int ierr;

    MPI_Init(&argc, &argv);

    ierr = MPI_Recv_init(NULL, 0, MPI_INT, MPI_PROC_NULL, MPI_ANY_TAG,
                         MPI_COMM_WORLD, &req);
    if (ierr != MPI_SUCCESS) MPI_Abort(MPI_COMM_WORLD, 1);

    ierr = MPI_Start(&req);
    if (ierr != MPI_SUCCESS) MPI_Abort(MPI_COMM_WORLD, 2);

    ierr = MPI_Wait(&req, &status);
    if (ierr != MPI_SUCCESS) MPI_Abort(MPI_COMM_WORLD, 3);

    if (MPI_PROC_NULL != status.MPI_SOURCE) {
        if (MPI_ANY_SOURCE == status.MPI_SOURCE) {
            printf("got MPI_ANY_SOURCE=%d instead of MPI_PROC_NULL=%d\n",
                   status.MPI_SOURCE, MPI_PROC_NULL);
        } else {
            printf("got %d instead of MPI_PROC_NULL=%d\n",
                   status.MPI_SOURCE, MPI_PROC_NULL);
        }
    } else {
        printf("OK\n");
    }

    MPI_Finalize();
    return 0;
}