Re: [OMPI devel] RFC: BTL Interface Change (2 of 5)
Nathan,

Indeed, the original design allowed for multiple usages of the same descriptor, not concurrent as the text in btl.h indicates, but consecutive. The MCA_BTL_FLAGS_RDMA_MATCHED flag is a weirdness needed for Portals, and I am not sure it is currently in use anywhere in the code base.

My problem with the depicted approach is that we now have two critical sections in the fast path: one to allocate/reserve the descriptor (at the BTL level, on a call from the PML), and another to allocate whatever structure the BTL needs to store the callback information (again on a call from the PML to the BTL). In the previous design, we carefully analyzed all communication paths and tried to minimize the number of back-and-forth calls between the PML and BTL layers in order to preserve performance.

George.

On Thu, Jul 10, 2014 at 2:57 PM, Nathan Hjelm wrote:
>
> What: Change the descriptor completion callback function to include
> cbdata and context pointers.
>
> Old callback:
>
> typedef void (*mca_btl_base_completion_fn_t)(
>     struct mca_btl_base_module_t* module,
>     struct mca_btl_base_endpoint_t* endpoint,
>     struct mca_btl_base_descriptor_t* descriptor,
>     int status);
>
> New callback:
>
> typedef void (*mca_btl_base_completion_fn_t)(
>     struct mca_btl_base_module_t* module,
>     struct mca_btl_base_endpoint_t* endpoint,
>     struct mca_btl_base_descriptor_t* descriptor,
>     void *cbdata, void *context, int status);
>
> Why: The BTL interface provides support for using a single descriptor
> with multiple concurrent RDMA operations. BTLs support this feature if
> the following flag is not set:
>
> /** RDMA put/get calls must have a matching prepare_{src,dst} call
>     on the target with the same base (and possibly bound). */
> #define MCA_BTL_FLAGS_RDMA_MATCHED 0x0040
>
> The problem is that in order to pass back the correct callback data and
> context to the completion function BTLs need to modify the
> descriptor. This could be a disaster in a multi-threaded application if
> one thread is calling the completion callback while another thread is
> preparing to start a put/get operation. To avoid issues it is better to
> provide the callback data as separate arguments.
>
> The change is straightforward and the commit will update all BTLs and
> BTL users to use the new completion callback signature.
>
> When: As this was discussed at the developer's meeting last month I am
> setting a short timeout for this RFC. This times out next Wed (July
> 16).
>
> I would really like feedback on this change. Can it be improved? Should
> the segment data be passed back to the function (not something I need
> for RMA but might be useful elsewhere)? Would it be better to remove the
> simultaneous RDMA feature in favor of a lightweight descriptor clone (I
> have this implemented as well and I have no problem with providing
> both features)?
>
> This is another in a series of RFCs to improve (I hope) the BTL
> interface for one-sided operations. The next RFC will be on the
> one-sided BTL interface.
>
> -Nathan Hjelm
> HPC-5, LANL
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/07/15101.php
Re: [hwloc-devel] --enable-plugins broken
Good enough; thanks for the refresher. :)

Sent from my phone. No type good.

> On Aug 18, 2014, at 2:07 PM, "Brice Goglin" wrote:
>
> On 18/08/2014 20:37, Jeff Squyres (jsquyres) wrote:
>> I notice that --enable-plugins seems to be broken -- it always ends in:
>>
>> configure: WARNING: Plugin support requested, but could not find ltdl.h
>> configure: error: Cannot continue
>>
>> if you don't have libltdl installed. Is this intentional? I.e., have we
>> already relied on an external libltdl? Or have we previously embedded
>> libltdl (via LT_CONFIG_LTDL_DIR), and it has just bit-rotted?
>
> We had both external and embedded ltdl support in the beginning. We
> removed embedded in 1.7.1.
> Brice
>
> commit 7491172a4754b0e198f699cb31b7c65c59ac4df6
> Author: Brice Goglin
> Date: Wed May 15 08:15:49 2013 +
>
>     Stop embedding libltdl, always use the system-wide libltdl
>
>     The ltdl embedding caused problems to the hwloc embedding such as
>     http://www.open-mpi.org/community/lists/hwloc-devel/2013/04/3659.php
>     We fixed the embedding AM_CONDITIONAL problem in
>     https://svn.open-mpi.org/trac/hwloc/changeset/5605
>     but the generated tarballs now (sometimes) miss libltdl,
>     causing configure to break.
>     The patch in the first link above worked around that issue but...
>
>     * Embedding ltdl is useful when you rely on recent ltdl features,
>       while ltdl 1.5 (couldn't test earlier) looks OK for hwloc,
>       and that version is very old and available everywhere.
>     * the ltdl embedding ability isn't perfect in "recursive" mode
>       (we had a hack for its config.h file in our configure
>       see http://lists.gnu.org/archive/html/libtool/2012-08/msg00016.html)
>     * we only need ltdl when --enable-plugins (not by default)
>
>     That's enough reasons to consider not embedding ltdl anymore.
>     We now require people to install libltdl-dev/libtool-ltdl-dev
>     if they want plugins.
>
>     This commit was SVN r5618.
>
> ___
> hwloc-devel mailing list
> hwloc-de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-devel
> Link to this post:
> http://www.open-mpi.org/community/lists/hwloc-devel/2014/08/4176.php
Re: [hwloc-devel] --enable-plugins broken
On 18/08/2014 20:37, Jeff Squyres (jsquyres) wrote:
> I notice that --enable-plugins seems to be broken -- it always ends in:
>
> configure: WARNING: Plugin support requested, but could not find ltdl.h
> configure: error: Cannot continue
>
> if you don't have libltdl installed. Is this intentional? I.e., have we
> already relied on an external libltdl? Or have we previously embedded
> libltdl (via LT_CONFIG_LTDL_DIR), and it has just bit-rotted?

We had both external and embedded ltdl support in the beginning. We removed embedded in 1.7.1.

Brice

commit 7491172a4754b0e198f699cb31b7c65c59ac4df6
Author: Brice Goglin
Date: Wed May 15 08:15:49 2013 +

    Stop embedding libltdl, always use the system-wide libltdl

    The ltdl embedding caused problems to the hwloc embedding such as
    http://www.open-mpi.org/community/lists/hwloc-devel/2013/04/3659.php
    We fixed the embedding AM_CONDITIONAL problem in
    https://svn.open-mpi.org/trac/hwloc/changeset/5605
    but the generated tarballs now (sometimes) miss libltdl,
    causing configure to break.
    The patch in the first link above worked around that issue but...

    * Embedding ltdl is useful when you rely on recent ltdl features,
      while ltdl 1.5 (couldn't test earlier) looks OK for hwloc,
      and that version is very old and available everywhere.
    * the ltdl embedding ability isn't perfect in "recursive" mode
      (we had a hack for its config.h file in our configure
      see http://lists.gnu.org/archive/html/libtool/2012-08/msg00016.html)
    * we only need ltdl when --enable-plugins (not by default)

    That's enough reasons to consider not embedding ltdl anymore.
    We now require people to install libltdl-dev/libtool-ltdl-dev
    if they want plugins.

    This commit was SVN r5618.
[hwloc-devel] --enable-plugins broken
I notice that --enable-plugins seems to be broken -- it always ends in:

configure: WARNING: Plugin support requested, but could not find ltdl.h
configure: error: Cannot continue

if you don't have libltdl installed. Is this intentional? I.e., have we already relied on an external libltdl? Or have we previously embedded libltdl (via LT_CONFIG_LTDL_DIR), and it has just bit-rotted?

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] [1.8.2rc4] OSHMEM fortran bindings with bad compilers
In the case of PGI compilers prior to 13, a workaround is to configure with --disable-oshmem-profile

On 2014/08/18 16:21, Gilles Gouaillardet wrote:
> Josh, Paul,
>
> the problem with old PGI compilers comes from the preprocessor (!)
>
> with pgi 12.10 :
> oshmem/shmem/fortran/start_pes_f.c
> SHMEM_GENERATE_WEAK_BINDINGS(START_PES, start_pes)
>
> gets expanded as
>
> #pragma weak START_PES = PSTART_PES SHMEM_GENERATE_WEAK_PRAGMA ( weak start_pes_ = pstart_pes_ )
>
> whereas with pgi 14.7, it gets expanded as
>
> #pragma weak START_PES = PSTART_PES
> #pragma weak start_pes_ = pstart_pes_
> #pragma weak start_pes__ = pstart_pes__
>
> from oshmem/shmem/fortran/profile/pbindings.h :
> #define SHMEM_GENERATE_WEAK_PRAGMA(x) _Pragma(#x)
>
> #define SHMEM_GENERATE_WEAK_BINDINGS(UPPER_NAME, lower_name) \
>     SHMEM_GENERATE_WEAK_PRAGMA(weak UPPER_NAME = P ## UPPER_NAME) \
>     SHMEM_GENERATE_WEAK_PRAGMA(weak lower_name ## _ = p ## lower_name ## _) \
>     SHMEM_GENERATE_WEAK_PRAGMA(weak lower_name ## __ = p ## lower_name ## __)
>
> a workaround is to manually expand the SHMEM_GENERATE_WEAK_BINDINGS
> macro and replace
>
> SHMEM_GENERATE_WEAK_BINDINGS(START_PES, start_pes)
>
> with
>
> SHMEM_GENERATE_WEAK_PRAGMA(weak START_PES = PSTART_PES)
> SHMEM_GENERATE_WEAK_PRAGMA(weak start_pes_ = pstart_pes_)
> SHMEM_GENERATE_WEAK_PRAGMA(weak start_pes__ = pstart_pes__)
>
> /* i was unable to get something that works by simply replacing the
> definition of the SHMEM_GENERATE_WEAK_BINDINGS macro */
>
> of course, this would have to be repeated in all the source files ...
>
> Cheers,
>
> Gilles
>
> On 2014/08/15 3:44, Paul Hargrove wrote:
>> Josh,
>>
>> The specific compilers that caused the most problems are the older PGI
>> compilers (any before 13.x).
>> In this case the user has the option to update their compiler to 13.10 or
>> newer.
>>
>> There are also issues with IBM's xlf.
>> For the IBM compiler I have never found a version that builds/links the MPI
>> f08 bindings, and now also find no version that can link the OSHMEM fortran
>> bindings.
>>
>> -Paul
>>
>> On Thu, Aug 14, 2014 at 11:30 AM, Joshua Ladd wrote:
>>
>>> We will update the README accordingly. Thank you, Paul.
>>>
>>> Josh
>>>
>>> On Thu, Aug 14, 2014 at 10:00 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
>>>
>>>> Good points.
>>>>
>>>> Mellanox -- can you update per Paul's suggestions?
>>>>
>>>> On Aug 13, 2014, at 8:26 PM, Paul Hargrove wrote:
>>>>
>>>>> The following is NOT a bug report.
>>>>> This is just an observation that may deserve some text in the README.
>>>>>
>>>>> I've reported issues in the past with some Fortran compilers (mostly older XLC and PGI) which either cannot build the "use mpi_f08" module, or cannot correctly link to it (and sometimes this fails only if configured with --enable-debug).
>>>>>
>>>>> Testing the OSHMEM Fortran bindings (enabled by default on Linux) I have found several compilers which fail to link the examples (hello_oshmemfh and ring_oshmemfh). I reported one specific instance (with xlc-11/xlf-13) back in February: http://www.open-mpi.org/community/lists/devel/2014/02/14057.php
>>>>>
>>>>> So far I have these failures only on platforms where the Fortran compiler is *known* to be broken for the MPI f90 and/or f08 bindings. Specifically, all the failing platforms are ones on which either:
>>>>> + Configure determines (without my help) that FC cannot build the F90 and/or F08 modules.
>>>>> OR
>>>>> + I must pass --enable-mpi-fortran=usempi or --enable-mpi-fortran=mpifh for cases configure cannot detect.
>>>>>
>>>>> So, I do *not* believe there is anything wrong with the OSHMEM code, which is why I started this post with "The following is NOT a bug report". However, I have two recommendations to make:
>>>>>
>>>>> 1) Documentation
>>>>>
>>>>> The README says just:
>>>>>
>>>>> --disable-oshmem-fortran
>>>>>   Disable building only the Fortran OSHMEM bindings.
>>>>>
>>>>> So, I recommend adding a sentence there referencing the "Compiler Notes" section of the README which has details on some known bad Fortran compilers.
>>>>>
>>>>> 2) Configure:
>>>>>
>>>>> As I noted above, at least some of the failures are on platforms where configure has determined it cannot build the f08 MPI bindings. So, maybe there is something that could be done at configure time to disqualify some Fortran compilers from building the OSHMEM Fortran bindings, too.
>>>>>
>>>>> -Paul
>>>>>
>>>>> --
>>>>> Paul H. Hargrove phhargr...@lbl.gov
>>>>> Future Technologies Group
>>>>> Computer and Data Sciences Department Tel: +1-510-495-2352
>>>>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>>>>> ___
>>>>> devel mailing list
>>>>> de...@open-mpi.org
>>>>> Subscription:
Re: [OMPI devel] [1.8.2rc4] OSHMEM fortran bindings with bad compilers
Josh, Paul,

the problem with old PGI compilers comes from the preprocessor (!)

with pgi 12.10 :
oshmem/shmem/fortran/start_pes_f.c
SHMEM_GENERATE_WEAK_BINDINGS(START_PES, start_pes)

gets expanded as

#pragma weak START_PES = PSTART_PES SHMEM_GENERATE_WEAK_PRAGMA ( weak start_pes_ = pstart_pes_ )

whereas with pgi 14.7, it gets expanded as

#pragma weak START_PES = PSTART_PES
#pragma weak start_pes_ = pstart_pes_
#pragma weak start_pes__ = pstart_pes__

from oshmem/shmem/fortran/profile/pbindings.h :
#define SHMEM_GENERATE_WEAK_PRAGMA(x) _Pragma(#x)

#define SHMEM_GENERATE_WEAK_BINDINGS(UPPER_NAME, lower_name) \
    SHMEM_GENERATE_WEAK_PRAGMA(weak UPPER_NAME = P ## UPPER_NAME) \
    SHMEM_GENERATE_WEAK_PRAGMA(weak lower_name ## _ = p ## lower_name ## _) \
    SHMEM_GENERATE_WEAK_PRAGMA(weak lower_name ## __ = p ## lower_name ## __)

a workaround is to manually expand the SHMEM_GENERATE_WEAK_BINDINGS macro and replace

SHMEM_GENERATE_WEAK_BINDINGS(START_PES, start_pes)

with

SHMEM_GENERATE_WEAK_PRAGMA(weak START_PES = PSTART_PES)
SHMEM_GENERATE_WEAK_PRAGMA(weak start_pes_ = pstart_pes_)
SHMEM_GENERATE_WEAK_PRAGMA(weak start_pes__ = pstart_pes__)

/* i was unable to get something that works by simply replacing the definition of the SHMEM_GENERATE_WEAK_BINDINGS macro */

of course, this would have to be repeated in all the source files ...

Cheers,

Gilles

On 2014/08/15 3:44, Paul Hargrove wrote:
> Josh,
>
> The specific compilers that caused the most problems are the older PGI
> compilers (any before 13.x).
> In this case the user has the option to update their compiler to 13.10 or
> newer.
>
> There are also issues with IBM's xlf.
> For the IBM compiler I have never found a version that builds/links the MPI
> f08 bindings, and now also find no version that can link the OSHMEM fortran
> bindings.
>
> -Paul
>
> On Thu, Aug 14, 2014 at 11:30 AM, Joshua Ladd wrote:
>
>> We will update the README accordingly. Thank you, Paul.
>>
>> Josh
>>
>> On Thu, Aug 14, 2014 at 10:00 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
>>
>>> Good points.
>>>
>>> Mellanox -- can you update per Paul's suggestions?
>>>
>>> On Aug 13, 2014, at 8:26 PM, Paul Hargrove wrote:
>>>
>>>> The following is NOT a bug report.
>>>> This is just an observation that may deserve some text in the README.
>>>>
>>>> I've reported issues in the past with some Fortran compilers (mostly older XLC and PGI) which either cannot build the "use mpi_f08" module, or cannot correctly link to it (and sometimes this fails only if configured with --enable-debug).
>>>>
>>>> Testing the OSHMEM Fortran bindings (enabled by default on Linux) I have found several compilers which fail to link the examples (hello_oshmemfh and ring_oshmemfh). I reported one specific instance (with xlc-11/xlf-13) back in February: http://www.open-mpi.org/community/lists/devel/2014/02/14057.php
>>>>
>>>> So far I have these failures only on platforms where the Fortran compiler is *known* to be broken for the MPI f90 and/or f08 bindings. Specifically, all the failing platforms are ones on which either:
>>>> + Configure determines (without my help) that FC cannot build the F90 and/or F08 modules.
>>>> OR
>>>> + I must pass --enable-mpi-fortran=usempi or --enable-mpi-fortran=mpifh for cases configure cannot detect.
>>>>
>>>> So, I do *not* believe there is anything wrong with the OSHMEM code, which is why I started this post with "The following is NOT a bug report". However, I have two recommendations to make:
>>>>
>>>> 1) Documentation
>>>>
>>>> The README says just:
>>>>
>>>> --disable-oshmem-fortran
>>>>   Disable building only the Fortran OSHMEM bindings.
>>>>
>>>> So, I recommend adding a sentence there referencing the "Compiler Notes" section of the README which has details on some known bad Fortran compilers.
>>>>
>>>> 2) Configure:
>>>>
>>>> As I noted above, at least some of the failures are on platforms where configure has determined it cannot build the f08 MPI bindings. So, maybe there is something that could be done at configure time to disqualify some Fortran compilers from building the OSHMEM Fortran bindings, too.
>>>>
>>>> -Paul
>>>>
>>>> --
>>>> Paul H. Hargrove phhargr...@lbl.gov
>>>> Future Technologies Group
>>>> Computer and Data Sciences Department Tel: +1-510-495-2352
>>>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>>>> ___
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>> Link to this post:
>>>> http://www.open-mpi.org/community/lists/devel/2014/08/15643.php
>>>
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>> For corporate legal information go to:
>>> http://www.cisco.com/web/about/doing_business/legal/cri/