Re: [OMPI users] Regression: Fortran derived types with newer MPI module
Yes, I can explain what's going on here. The short version is that a
change was made with the intent to provide maximum Fortran code safety,
but with a possible backwards compatibility issue. If this change is
causing real problems, we can probably change this, but I'd like a
little feedback from the Fortran MPI dev community first.

It's a complex issue, and requires a little background and discussion,
sorry...

a) Back in the 1.6.x series, we allowed users to build multiple variants
of the "use mpi" Fortran interface:

- tiny: only the MPI_SIZEOF subroutine
- small: tiny + all MPI subroutines that did not take choice buffers,
  and the MPI functions (WTICK, WTIME)
- medium: small + all MPI subroutines that take 1 choice buffer (e.g.,
  MPI_SEND)
- large: all MPI subroutines (even those that take 2 choice buffers,
  such as collectives)
  --> Note: the "large" size never really worked for uninteresting
  reasons. It won't be fixed.

See the OMPI 1.6.x README for more details.

The default is "small" in the 1.6.x series. This means that when you
call MPI_SEND (and any other function that takes a choice buffer), you
are not getting an MPI-implementation-provided prototype for that
function -- it's essentially the same as how everyone has implemented
mpif.h (i.e., no prototypes).

This is why you are able to compile your code in OMPI 1.6.x with "use
mpi" -- because there is no prototype for MPI_SEND in the mpi module.
Heck, you could even:

    ! Don't pass enough params to MPI_SEND
    call MPI_Send(bogus)

and it would compile and link. It will likely segv at run time, but
that's a different issue.

b) I *believe* that MPICH does the equivalent of "tiny", but I'm not
going to swear to that (meaning: you're not getting any prototypes for
any MPI subroutines other than MPI_SIZEOF). This is why you are able to
compile your code with MPICH and "use mpi" -- same disclaimers as a)
(i.e., you get no compile-time protection for when you don't call MPI
subroutines properly).

c) The design of the MPI-2 "mpi" module has multiple flaws that are
identified in the MPI-3 text (but were not recognized back in MPI-2.x
days).

Here's one: until F2008+addendums, there was no Fortran equivalent of
"void *". Hence, the mpi module has to overload MPI_Send() and have a
prototype *for every possible type and dimension*.

The OMPI "medium" implementation actually provides overloaded prototypes
for all pre-defined Fortran datatypes (INTEGER, REAL, ...etc.), for
scalars and, by default, array ranks up to 4. Fortran (through F2003)
allows arrays of up to 7 dimensions, but providing an overloaded
interface for each scalar type and all array dimensions for each type
explodes the number of overloaded prototypes in the mpi module; most
compilers that we tested several years ago would segv with this many
interfaces in a single module.

It gets worse with the MPI subroutines that take multiple choice
buffers: you get an exponential explosion of interfaces. IIRC, a
fully-populated mpi "large" module would contain over 5 million
interfaces. Craig Rasmussen and I wrote a paper about this for
Euro PVM/MPI 2005
(http://www.open-mpi.org/papers/euro-pvmmpi-2005-fortran/). It was one
of the issues that eventually led to the creation of the MPI-3 mpi_f08
module.

Here's another fatal flaw: it's not possible for an MPI implementation
to provide MPI_Send() prototypes for user-defined Fortran datatypes.
Hence, the example you cite is a pipe dream for the "mpi" module because
there's no way to specify a (void*)-like argument for the choice buffer.

Meaning: Fortran MPI apps can either have compile-time safety or
user-defined datatypes as choice buffers. Pick one.

d) A solution to the problems listed in c) is to use non-standard,
compiler-specific "ignore TKR" functionality in the mpi module
implementation, which effectively provides (void*) functionality.
Hence, an implementation can have a *single* MPI_SEND subroutine
interface, and use a pragma to ignore the type, kind, and rank of the
choice buffer parameter.

OMPI 1.7 and beyond actually has 2 different implementations of the mpi
module:

- the old tiny/small/medium-based interface for compilers that do not
  support "ignore TKR" pragmas (i.e., gfortran)
- a new ignore-TKR-based module that prototypes all MPI subroutines and
  functions

Meaning: OMPI 1.7 with non-gfortran works great (i.e., your sample code
compiles). OMPI 1.7 with gfortran is *mostly* the same as it was in 1.6,
except that we changed the default from "small" to "medium".

*** This is what is causing your problem.

In OMPI 1.6, we didn't provide an interface for MPI_SEND by default. In
OMPI 1.7, we do.

Craig Rasmussen and I debated long and hard about whether to change the
default from "small" to "medium" or not. We finally ended up doing it
with the following rationale:

- very few codes use the "mpi" module
- but those who do should have the maximum amount of compile-tim

Re: [OMPI users] Regression: Fortran derived types with newer MPI module
On Jan 7, 2014, at 10:09 PM, Jeff Squyres wrote:

> - the old tiny/small/medium-based interface for compilers that do not support
>   "ignore TKR" pragmas (i.e., gfortran)

And of course, as soon as I say this publicly, I see a post from Tobias
reminding me that gcc/gfortran 4.9 will include their own ignore TKR
functionality (and he reminds me to include the patch to support it in
OMPI... which I'll do for trunk/1.7.5). :-)

This means that you'll get a good "mpi" module with gfortran 4.9 (i.e.,
all MPI subroutines and functions are prototyped, and (void*)-like
functionality is supported).

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/

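To put rough numbers on the overloading problem described in point c) of the first message, here is a minimal arithmetic sketch in Python. It assumes every choice buffer must be overloaded once per (type, rank) pair, so a routine with k choice buffers needs (types x ranks)^k specific interfaces. The type count `NUM_TYPES` is a made-up placeholder, not the list Open MPI actually uses, and the sketch does not attempt to reproduce the 5 million figure quoted above.

```python
def specific_interfaces(num_types: int, max_rank: int, choice_buffers: int) -> int:
    """Count the specific procedures one generic routine needs when each
    choice buffer is overloaded for every (type, rank) combination."""
    # Ranks run from 0 (scalar) through max_rank, hence the +1.
    combos_per_buffer = num_types * (max_rank + 1)
    # Each additional choice buffer multiplies the count again.
    return combos_per_buffer ** choice_buffers


# Hypothetical number of intrinsic type/kind combinations (an assumption).
NUM_TYPES = 10

for max_rank in (4, 7):
    one = specific_interfaces(NUM_TYPES, max_rank, 1)
    two = specific_interfaces(NUM_TYPES, max_rank, 2)
    print(f"ranks 0..{max_rank}: {one} per 1-buffer routine, "
          f"{two} per 2-buffer routine")
```

Multiplying the per-routine figure by the number of MPI routines in each class gives the size of the whole module. A single ignore-TKR interface per routine avoids this growth entirely.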
Re: [OMPI users] Regression: Fortran derived types with newer MPI module
"Jeff Squyres (jsquyres)" writes: > Yes, I can explain what's going on here. The short version is that a > change was made with the intent to provide maximum Fortran code > safety, but with a possible backwards compatibility issue. If this > change is causing real problems, we can probably change this, but I'd > like a little feedback from the Fortran MPI dev community first. On page 610, I see text disallowing the explicit interfaces in ompi/mpi/fortran/use-mpi-tkr: In S2 and S3: Without such extensions, routines with choice buffers should be provided with an implicit interface, instead of overloading with a different MPI function for each possible buffer type (as mentioned in Section 17.1.11 on page 625). Such overloading would also imply restrictions for passing Fortran derived types as choice buffer, see also Section 17.1.15 on page 629. Why did OMPI decide that this (presumably non-normative) text in the standard was not worth following? (Rejecting something in the standard indicates stronger convictions than would be independently weighing the benefits of each approach.) > c) The design of the MPI-2 "mpi" module has multiple flaws that are > identified in the MPI-3 text (but were not recognized back in MPI-2.x > days). Here's one: until F2008+addendums, there was no Fortran > equivalent of "void *". Hence, the mpi module has to overload > MPI_Send() and have a prototype *for every possible type and > dimension*. And this is not possible, thus the text saying not to do it. I don't call MPI from Fortran, but someone on a Fortran project that I watch mentioned that the compiler would complain about such and such a use (actually relating to types for MPI_Status in MPI_Recv rather than buffer types). My immediate response was "they can't do that because without nonstandard or post-F08 extensions (or exposing the user to c_loc), the type system cannot express those functions and thus you cannot have explicit interfaces". 
But then I looked at the latest OMPI and indeed, it was enumerating
types, thus my email.

> Here's another fatal flaw: it's not possible for an MPI implementation
> to provide MPI_Send() prototypes for user-defined Fortran datatypes.
> Hence, the example you cite is a pipe dream for the "mpi" module
> because there's no way to specify a (void*)-like argument for the
> choice buffer.

F2003 has c_loc, which is a sufficient stop-gap until TS 29113 is widely
available. I have long advocated that the best way to write extensible
libraries for Fortran 2003 callers (even if the library is implemented
entirely in Fortran) involves some use of c_loc (e.g., for context
arguments). This annoys the Fortran programmers and they usually write
perl scripts to generate interfaces that enumerate the types they need
and give up on extensibility. ;-)

It's nice to know that after 60 years (when Fortran 201x is released,
including TS 29113), there will be a Fortran standard with an analogue
of void*.

> Craig Rasmussen and I debated long and hard about whether to change
> the default from "small" to "medium" or not. We finally ended up
> doing it with the following rationale:
>
> - very few codes use the "mpi" module

FWIW, I've noticed a few projects transition to it in the last few
years.

> - but those who do should have the maximum amount of compile-time protection
>
> ...but we always knew that someone may come complaining some day. And that
> day has now come.
>
> So my question to you / the Fortran MPI dev community is: what do you want
> (for gfortran)?
>
> Do you want us to go back to the "small" size by default, or do you
> want more compile-time protection by default?
> (with the obvious caveat that you can't use user-defined Fortran
> datatypes as choice buffers; you might be able to use something like
> c_loc, but I haven't thought deeply about this and don't know offhand
> if that works)

I can't answer this as a Fortran developer, but I know that a lot of
projects want some modicum of portability and in practice, it takes
almost 10 years to flush the old compilers out of production
environments. Either the upgrade problem will need to be fixed [1] so
that nearly all existing machines have new compilers or Fortran projects
will be wrestling with this for a long time yet.

Most Fortran packages I know use homogeneous arrays, which also means
that they don't call MPI_Type_create_struct or similar functions. If
those functions are going to be provided by the module, I think they
should be able to use them (e.g., examples in the Standard should work)
and the Standard's advice about implicit interfaces should be followed.

[1] Also, there are still production machines without MPI-2.0 and I get
email if I make a mistake in providing MPI-1 fallback paths.
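
As an illustration of the "scripts that enumerate the types they need" approach mentioned above, here is a small Python sketch that emits a Fortran generic interface with one specific procedure per (type, rank) pair. Everything in it is hypothetical: the routine name `my_send`, its argument list, the type list, and the rank limit are placeholders for illustration and are not taken from Open MPI's generator or from the MPI bindings.

```python
# Hypothetical inputs: a real project would list only the types it needs.
TYPES = ["integer", "real", "double precision"]
MAX_RANK = 2


def specific_name(generic, type_name, rank):
    """Build a unique name for one specific procedure, e.g. my_send_real_r1."""
    return f"{generic}_{type_name.replace(' ', '_')}_r{rank}"


def buffer_decl(type_name, rank):
    """Declare the buffer argument: scalar for rank 0, assumed-shape otherwise."""
    shape = "" if rank == 0 else ", dimension(" + ",".join([":"] * rank) + ")"
    return f"{type_name}{shape}, intent(in) :: buf"


def generate_interface(generic, types, max_rank):
    """Return the text of a generic interface covering every (type, rank)."""
    lines = [f"interface {generic}"]
    for type_name in types:
        for rank in range(max_rank + 1):
            name = specific_name(generic, type_name, rank)
            lines += [
                f"  subroutine {name}(buf, count, ierr)",
                f"    {buffer_decl(type_name, rank)}",
                "    integer, intent(in) :: count",
                "    integer, intent(out) :: ierr",
                f"  end subroutine {name}",
            ]
    lines.append(f"end interface {generic}")
    return "\n".join(lines)


text = generate_interface("my_send", TYPES, MAX_RANK)
print(text)
```

A derived type that is not in `TYPES` gets no matching specific procedure, so a call passing it would be rejected at compile time. That is the restriction on derived-type choice buffers discussed in this thread.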