Hi David,

My assumption was that users would/should call MPI_Finalize() as the last thing 
in their program. 
If you read the MPI 3.0 standard (p. 361, line 28), an MPI implementation is 
only required to guarantee that process 0 returns from MPI_Finalize(). Granted, 
you can use process 0 to do HDF5 serial I/O.

From the HDF5 side, we shouldn't make MPI calls if MPI is not there and the 
user is doing serial I/O, so yes, the checks should include both is_initialized 
and is_finalized. I will enter a bug report for this, but again, as per the 
standard, I advise against doing anything after MPI_Finalize() is called.

Thanks,
Mohamad

-----Original Message-----
From: Hdf-forum [mailto:[email protected]] On Behalf Of 
David A. Schneider
Sent: Monday, June 08, 2015 12:02 PM
To: [email protected]
Subject: [Hdf-forum] hdf5 1.8.15 issue with calling MPI_Comm_create_keyval 
after MPI_Finalized called

We are trying to update our installation of hdf5 from 1.8.14 to 1.8.15. 
I have a sequence of unit tests that repeatedly run a program with different 
parameters, checking the output of the program. This program is a C++ program 
that uses the hdf5 library in a serial manner (it doesn't use the parallel 
features of the library) but it also uses MPI. 
The unit tests are written in Python, and they may use h5py to check the output 
of the program. The unit tests will run the program using mpiexec locally with 
a few ranks. We have openmpi 1.8.1 installed. What I'm finding with hdf5 
1.8.15 is that although each unit test seems to succeed, the whole script 
fails with output like:

*** The MPI_Comm_create_keyval() function was called after MPI_FINALIZE was 
invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[(null):16407] Local abort after MPI_FINALIZE completed successfully; not able 
to aggregate error messages, and not able to guarantee that all other processes 
were killed!
HDF5: infinite loop closing library
E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E

When I look through the hdf5 1.8.15 source code, I find in H5.c, the
H5_init_library(void) function, that it does

     int mpi_initialized;
     MPI_Initialized(&mpi_initialized);
     if (mpi_initialized) {
         int key_val;
         if(MPI_SUCCESS != (mpi_code = MPI_Comm_create_keyval(MPI_NULL_COPY_FN, ...

I'm not sure that this is correct. I wrote a small program that did

   int mpi_initialized = 0, mpi_finalized = 0;
   MPI_Init(&argc, &argv);
   MPI_Finalize();
   MPI_Initialized(&mpi_initialized);
   MPI_Finalized(&mpi_finalized);
   std::cout << "MPI_Finalize. mpi_initialized=" << mpi_initialized
             << " mpi_finalized=" << mpi_finalized << std::endl;

and I found that both mpi_initialized and mpi_finalized were 1 after I called 
MPI_Finalize. That is, I think H5_init_library() should check to see if 
MPI_Finalize has been called as well. I'm not sure what hdf5 call is triggering 
the error messages, it may happen when h5py unloads (we have h5py version 
2.3.1).

I did try some simple programs that looked like


   MPI_Init(&argc, &argv);
   MPI_Finalize();
   H5Fcreate(...); // or some other simple HDF5 function calls

and I couldn't reproduce those error messages, so maybe there is something 
else wrong.

best,

David Schneider
software engineer
LCLS, SLAC



_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5
