Re: [OMPI devel] [1.8.2rc4] OSHMEM fortran bindings with bad compilers
Josh, Paul,

the problem with old PGI compilers comes from the preprocessor (!)

with pgi 12.10 :
oshmem/shmem/fortran/start_pes_f.c
SHMEM_GENERATE_WEAK_BINDINGS(START_PES, start_pes)

gets expanded as

#pragma weak START_PES = PSTART_PES SHMEM_GENERATE_WEAK_PRAGMA ( weak start_pes_ = pstart_pes_ )

whereas with pgi 14.7, it gets expanded as

#pragma weak START_PES = PSTART_PES
#pragma weak start_pes_ = pstart_pes_
#pragma weak start_pes__ = pstart_pes__

from oshmem/shmem/fortran/profile/pbindings.h :
#define SHMEM_GENERATE_WEAK_PRAGMA(x) _Pragma(#x)

#define SHMEM_GENERATE_WEAK_BINDINGS(UPPER_NAME, lower_name) \
    SHMEM_GENERATE_WEAK_PRAGMA(weak UPPER_NAME = P ## UPPER_NAME) \
    SHMEM_GENERATE_WEAK_PRAGMA(weak lower_name ## _ = p ## lower_name ## _) \
    SHMEM_GENERATE_WEAK_PRAGMA(weak lower_name ## __ = p ## lower_name ## __)

a workaround is to manually expand the SHMEM_GENERATE_WEAK_BINDINGS
macro and replace

SHMEM_GENERATE_WEAK_BINDINGS(START_PES, start_pes)

with

SHMEM_GENERATE_WEAK_PRAGMA(weak START_PES = PSTART_PES)
SHMEM_GENERATE_WEAK_PRAGMA(weak start_pes_ = pstart_pes_)
SHMEM_GENERATE_WEAK_PRAGMA(weak start_pes__ = pstart_pes__)

/* i was unable to get something that works by simply replacing the
definition of the SHMEM_GENERATE_WEAK_BINDINGS macro */

of course, this would have to be repeated in all the source files ...

Cheers,

Gilles

On 2014/08/15 3:44, Paul Hargrove wrote:
> Josh,
>
> The specific compilers that caused the most problems are the older PGI
> compilers (any before 13.x).
> In this case the user has the option to update their compiler to 13.10 or
> newer.
>
> There are also issues with IBM's xlf.
> For the IBM compiler I have never found a version that builds/links the MPI
> f08 bindings, and now also find no version that can link the OSHMEM fortran
> bindings.
>
> -Paul
>
> -Paul
>
>
> On Thu, Aug 14, 2014 at 11:30 AM, Joshua Ladd wrote:
>
>> We will update the README accordingly. Thank you, Paul.
>>
>> Josh
>>
>>
>> On Thu, Aug 14, 2014 at 10:00 AM, Jeff Squyres (jsquyres) <
>> jsquy...@cisco.com> wrote:
>>
>>> Good points.
>>>
>>> Mellanox -- can you update per Paul's suggestions?
>>>
>>>
>>> On Aug 13, 2014, at 8:26 PM, Paul Hargrove wrote:
>>>
>>> The following is NOT a bug report.
>>> This is just an observation that may deserve some text in the README.
>>>
>>> I've reported issues in the past with some Fortran compilers (mostly
>>> older XLC and PGI) which either cannot build the "use mpi_f08" module, or
>>> cannot correctly link to it (and sometimes this fails only if configured
>>> with --enable-debug).
>>>
>>> Testing the OSHMEM Fortran bindings (enabled by default on Linux) I
>>> have found several compilers which fail to link the examples
>>> (hello_oshmemfh and ring_oshmemfh). I reported one specific instance (with
>>> xlc-11/xlf-13) back in February:
>>> http://www.open-mpi.org/community/lists/devel/2014/02/14057.php
>>>
>>> So far I have these failures only on platforms where the Fortran
>>> compiler is *known* to be broken for the MPI f90 and/or f08 bindings.
>>> Specifically, all the failing platforms are ones on which either:
>>> + Configure determines (without my help) that FC cannot build the F90
>>> and/or F08 modules.
>>> OR
>>> + I must pass --enable-mpi-fortran=usempi or --enable-mpi-fortran=mpifh
>>> for cases configure cannot detect.
>>>
>>> So, I do *not* believe there is anything wrong with the OSHMEM code,
>>> which is why I started this post with "The following is NOT a bug report".
>>> However, I have two recommendations to make:
>>>
>>> 1) Documentation
>>>
>>> The README says just:
>>>
>>> --disable-oshmem-fortran
>>>   Disable building only the Fortran OSHMEM bindings.
>>>
>>> So, I recommend adding a sentence there referencing the "Compiler
>>> Notes" section of the README which has details on some known bad Fortran
>>> compilers.
>>>
>>> 2) Configure:
>>>
>>> As I noted above, at least some of the failures are on platforms where
>>> configure has determined it cannot build the f08 MPI bindings. So, maybe
>>> there is something that could be done at configure time to disqualify some
>>> Fortran compilers from building the OSHMEM Fortran bindings, too.
>>>
>>> -Paul
>>>
>>> --
>>> Paul H. Hargrove phhargr...@lbl.gov
>>> Future Technologies Group
>>> Computer and Data Sciences Department Tel: +1-510-495-2352
>>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/devel/2014/08/15643.php
>>>
>>>
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>> For corporate legal information go to:
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>
>>> ___
>>> devel mailing list
>>> de.
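
To make the expansion discussed in the message above concrete, here is a minimal sketch of the two forms side by side. It assumes a compiler and ELF target that accept "#pragma weak alias = target" (GCC and Clang on Linux do). Every DEMO_/demo_ name is hypothetical and only mirrors the shape of the macros quoted from pbindings.h; this is not the Open MPI source.

```c
#include <assert.h>

/* Same shape as the two macros quoted from pbindings.h, under
 * hypothetical DEMO_ names. */
#define DEMO_WEAK_PRAGMA(x) _Pragma(#x)

#define DEMO_WEAK_BINDINGS(UPPER_NAME, lower_name)                  \
    DEMO_WEAK_PRAGMA(weak UPPER_NAME = P ## UPPER_NAME)             \
    DEMO_WEAK_PRAGMA(weak lower_name ## _ = p ## lower_name ## _)   \
    DEMO_WEAK_PRAGMA(weak lower_name ## __ = p ## lower_name ## __)

/* Prototypes for the public names that become weak aliases. */
int DEMO_INIT(void);
int demo_init_(void);
int demo_init__(void);
int DEMO_FINI(void);
int demo_fini_(void);
int demo_fini__(void);

/* Strong "profiling" definitions that the weak names will alias. */
int PDEMO_INIT(void)   { return 1; }
int pdemo_init_(void)  { return 2; }
int pdemo_init__(void) { return 3; }
int PDEMO_FINI(void)   { return 10; }
int pdemo_fini_(void)  { return 20; }
int pdemo_fini__(void) { return 30; }

/* Generic form: needs the preprocessor to rescan the nested macro
 * three times, which is the step reported to fail with pgi 12.10. */
DEMO_WEAK_BINDINGS(DEMO_INIT, demo_init)

/* Manually expanded form, i.e. the workaround: one invocation per line. */
DEMO_WEAK_PRAGMA(weak DEMO_FINI = PDEMO_FINI)
DEMO_WEAK_PRAGMA(weak demo_fini_ = pdemo_fini_)
DEMO_WEAK_PRAGMA(weak demo_fini__ = pdemo_fini__)

/* Returns 0 when every public name resolves to its strong definition. */
int demo_check_bindings(void)
{
    assert(DEMO_INIT() == 1);
    assert(demo_init_() == 2);
    assert(demo_init__() == 3);
    assert(DEMO_FINI() == 10);
    assert(demo_fini_() == 20);
    assert(demo_fini__() == 30);
    return 0;
}
```

With a preprocessor that handles the nested macro, both forms should produce the same three weak aliases, so the manual expansion changes nothing for compilers that already work.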
Re: [OMPI devel] [1.8.2rc4] OSHMEM fortran bindings with bad compilers
In the case of PGI compilers prior to 13, a workaround is to configure with
--disable-oshmem-profile

On 2014/08/18 16:21, Gilles Gouaillardet wrote:
> Josh, Paul,
>
> the problem with old PGI compilers comes from the preprocessor (!)
>
> with pgi 12.10 :
> oshmem/shmem/fortran/start_pes_f.c
> SHMEM_GENERATE_WEAK_BINDINGS(START_PES, start_pes)
>
> gets expanded as
>
> #pragma weak START_PES = PSTART_PES SHMEM_GENERATE_WEAK_PRAGMA ( weak start_pes_ = pstart_pes_ )
>
> whereas with pgi 14.7, it gets expanded as
>
> #pragma weak START_PES = PSTART_PES
> #pragma weak start_pes_ = pstart_pes_
> #pragma weak start_pes__ = pstart_pes__
>
> from oshmem/shmem/fortran/profile/pbindings.h :
> #define SHMEM_GENERATE_WEAK_PRAGMA(x) _Pragma(#x)
>
> #define SHMEM_GENERATE_WEAK_BINDINGS(UPPER_NAME, lower_name) \
>     SHMEM_GENERATE_WEAK_PRAGMA(weak UPPER_NAME = P ## UPPER_NAME)\
>     SHMEM_GENERATE_WEAK_PRAGMA(weak lower_name ## _ = p ## lower_name ## _) \
>     SHMEM_GENERATE_WEAK_PRAGMA(weak lower_name ## __ = p ## lower_name ## __)
>
> a workaround is to manually expand the SHMEM_GENERATE_WEAK_BINDINGS
> macro and replace
>
> SHMEM_GENERATE_WEAK_BINDINGS(START_PES, start_pes)
>
> with
>
> SHMEM_GENERATE_WEAK_PRAGMA(weak START_PES = PSTART_PES)
> SHMEM_GENERATE_WEAK_PRAGMA(weak start_pes_ = pstart_pes_)
> SHMEM_GENERATE_WEAK_PRAGMA(weak start_pes__ = pstart_pes__)
>
> /* i was unable to get something that works by simply replacing the
> definition of the SHMEM_GENERATE_WEAK_BINDINGS macro */
>
> of course, this would have to be repeated in all the source files ...
>
>
> Cheers,
>
> Gilles
>
> On 2014/08/15 3:44, Paul Hargrove wrote:
>> Josh,
>>
>> The specific compilers that caused the most problems are the older PGI
>> compilers (any before 13.x).
>> In this case the user has the option to update their compiler to 13.10 or
>> newer.
>>
>> There are also issues with IBM's xlf.
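
The --disable-oshmem-profile workaround mentioned in this reply presumably helps because a build without the profiling layer has no need to generate weak bindings at all. That reading is an assumption, and the sketch below only illustrates the general pattern. DEMO_PROFILING and the demo_ names are hypothetical and are not taken from the Open MPI sources.

```c
#include <assert.h>

/* Hypothetical switch: 0 models a build without the profiling layer. */
#ifndef DEMO_PROFILING
#define DEMO_PROFILING 0
#endif

int demo_barrier_(void);

#if DEMO_PROFILING
/* Profiling build: the public name is a weak alias, and the function
 * body below is compiled under the p-prefixed name. */
_Pragma("weak demo_barrier_ = pdemo_barrier_")
#define demo_barrier_ pdemo_barrier_
#endif

/* Non-profiling build: this defines demo_barrier_ directly, so no
 * pragma-generating macro is ever handed to the preprocessor. */
int demo_barrier_(void) { return 42; }

#if DEMO_PROFILING
#undef demo_barrier_
#endif

/* Callers use the public name in both configurations. */
int demo_call_barrier(void)
{
    int rc = demo_barrier_();
    assert(rc == 42);
    return rc;
}
```

The trade-off in such a scheme is that tools relying on the p-prefixed entry points have nothing to intercept when the switch is off.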
Re: [OMPI devel] RFC: BTL Interface Change (2 of 5)
Nathan,

Indeed, the original design allowed for multiple uses of the same
descriptor, but consecutive ones rather than the concurrent ones that the
text in btl.h describes. The MCA_BTL_FLAGS_RDMA_MATCHED flag is a weirdness
needed for Portal, and I am not sure it is currently in use anywhere in the
code base.

My problem with the depicted approach is that we now have two critical
sections in the fast path: one to allocate/reserve the descriptor (this is
at the BTL level, on a call from the PML), and then another one to allocate
whatever structure the BTL needs to store the callback information (again
on a call from the PML to the BTL). In the previous design, we carefully
analyzed all communication paths and tried to minimize the number of
back-and-forth calls between the PML and BTL layers in order to preserve
performance.

George.

On Thu, Jul 10, 2014 at 2:57 PM, Nathan Hjelm wrote:

>
> What: Change the descriptor completion callback function to include
> cbdata and context pointers.
>
> Old callback:
>
> typedef void (*mca_btl_base_completion_fn_t)(
>     struct mca_btl_base_module_t* module,
>     struct mca_btl_base_endpoint_t* endpoint,
>     struct mca_btl_base_descriptor_t* descriptor,
>     int status);
>
>
> New callback:
>
> typedef void (*mca_btl_base_completion_fn_t)(
>     struct mca_btl_base_module_t* module,
>     struct mca_btl_base_endpoint_t* endpoint,
>     struct mca_btl_base_descriptor_t* descriptor,
>     void *cbdata, void *context, int status);
>
>
> Why: The BTL interface provides support for using a single descriptor
> with multiple concurrent RDMA operations. BTLs support this feature if
> the following flag is not set:
>
> /** RDMA put/get calls must have a matching prepare_{src,dst} call
>     on the target with the same base (and possibly bound). */
> #define MCA_BTL_FLAGS_RDMA_MATCHED 0x0040
>
>
> The problem is that in order to pass back the correct callback data and
> context to the completion function BTLs need to modify the
> descriptor. This could be a disaster in a multi-threaded application if
> one thread is calling the completion callback while another thread is
> preparing to start a put/get operation. To avoid issues it is better to
> provide the callback data as separate arguments.
>
> The change is straightforward and the commit will update all BTLs and
> BTL users to use the new completion callback signature.
>
>
> When: As this was discussed at the developer's meeting last month I am
> setting a short timeout for this RFC. This times out next Wed (July
> 16).
>
>
> I would really like feedback on this change. Can it be improved? Should
> the segment data be passed back to the function (not something I need
> for RMA but might be useful elsewhere)? Would it be better to remove the
> simultaneous RDMA feature in favor of a lightweight descriptor clone (I
> have this implemented as well and I have no problem with providing
> both features)?
>
>
> This is another in a series of RFCs to improve (I hope) the BTL
> interface for one-sided operations. The next RFC will be on the
> one-sided BTL interface.
>
> -Nathan Hjelm
> HPC-5, LANL
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/07/15101.php
>
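
As a rough illustration of the callback change proposed in the quoted RFC, the sketch below passes cbdata and context as arguments so that one descriptor can serve several operations without being modified. The callback typedef is copied from the RFC text. The struct bodies are empty stand-ins so the example compiles on its own, and everything prefixed demo_ is hypothetical rather than part of the BTL interface.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in definitions; the real structures are defined by Open MPI. */
struct mca_btl_base_module_t     { int placeholder; };
struct mca_btl_base_endpoint_t   { int placeholder; };
struct mca_btl_base_descriptor_t { int placeholder; };

/* New-style completion callback, as given in the RFC. */
typedef void (*mca_btl_base_completion_fn_t)(
    struct mca_btl_base_module_t* module,
    struct mca_btl_base_endpoint_t* endpoint,
    struct mca_btl_base_descriptor_t* descriptor,
    void *cbdata, void *context, int status);

/* Hypothetical per-operation record a BTL might keep on the side. */
typedef struct demo_rdma_op {
    mca_btl_base_completion_fn_t cbfunc;
    void *cbdata;
    void *context;
} demo_rdma_op_t;

/* Completes one operation; the shared descriptor is only read. */
static void demo_complete(struct mca_btl_base_module_t *module,
                          struct mca_btl_base_endpoint_t *endpoint,
                          struct mca_btl_base_descriptor_t *descriptor,
                          const demo_rdma_op_t *op, int status)
{
    op->cbfunc(module, endpoint, descriptor, op->cbdata, op->context, status);
}

/* Example callback: counts successful completions in *cbdata. */
static void demo_count_cb(struct mca_btl_base_module_t *module,
                          struct mca_btl_base_endpoint_t *endpoint,
                          struct mca_btl_base_descriptor_t *descriptor,
                          void *cbdata, void *context, int status)
{
    (void) module; (void) endpoint; (void) descriptor; (void) context;
    if (0 == status) {
        *(int *) cbdata += 1;
    }
}

/* Two operations share one descriptor and complete out of order. */
int demo_shared_descriptor(int *hits_a, int *hits_b)
{
    struct mca_btl_base_module_t module = { 0 };
    struct mca_btl_base_endpoint_t endpoint = { 0 };
    struct mca_btl_base_descriptor_t descriptor = { 0 };
    demo_rdma_op_t op_a = { demo_count_cb, hits_a, NULL };
    demo_rdma_op_t op_b = { demo_count_cb, hits_b, NULL };

    demo_complete(&module, &endpoint, &descriptor, &op_b, 0);
    demo_complete(&module, &endpoint, &descriptor, &op_a, 0);
    demo_complete(&module, &endpoint, &descriptor, &op_b, 0);
    return 0;
}
```

The per-operation record in this sketch is the kind of extra structure George's reply is concerned about: something has to hold cbdata and context until completion, and obtaining it in the fast path is the second critical section he describes.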