Paul,

Any chance you can provide us with the example code that demonstrates the problem? If so, could you please mail it to [email protected]? We will enter a bug report and will take a look. It will also help if you can indicate the OS, compiler version, and MPI I/O version.
Thank you!

Elena

On Apr 20, 2010, at 8:29 AM, Paul Hilscher wrote:

> Dear all,
>
> I have been trying to fix the following problem for more than 3 months but still have not succeeded; I hope some of you gurus can help me out.
>
> I am using HDF5 to store the results from a plasma turbulence code (basically 6-D and 3-D data, and a table to store several scalar values). In a single-CPU run, HDF5 (and parallel HDF5) works fine, but for a larger CPU count (and a large number of data output steps) I get the following error message at the end of the simulation, when I want to close the HDF5 file:
>
> ********* snip ****
>
> HDF5-DIAG: Error detected in HDF5 (1.8.4-patch1) MPI-process 24:
>   #000: H5F.c line 1956 in H5Fclose(): decrementing file ID failed
>     major: Object atom
>     minor: Unable to close file
>   #001: H5F.c line 1756 in H5F_close(): can't close file
>     major: File accessability
>     minor: Unable to close file
>   #002: H5F.c line 1902 in H5F_try_close(): unable to flush cache
>     major: Object cache
>     minor: Unable to flush data from cache
>   #003: H5F.c line 1681 in H5F_flush(): unable to flush metadata cache
>     major: Object cache
>     minor: Unable to flush data from cache
>   #004: H5AC.c line 950 in H5AC_flush(): Can't flush.
>     major: Object cache
>     minor: Unable to flush data from cache
>   #005: H5AC.c line 4695 in H5AC_flush_entries(): Can't propagate clean entries list.
>     major: Object cache
>     minor: Unable to flush data from cache
>   #006: H5AC.c line 4450 in H5AC_propagate_flushed_and_still_clean_entries_list(): Can't receive and/or process clean slist broadcast.
>     major: Object cache
>     minor: Internal error detected
>   #007: H5AC.c line 4595 in H5AC_receive_and_apply_clean_list(): Can't mark entries clean.
>     major: Object cache
>     minor: Internal error detected
>   #008: H5C.c line 5150 in H5C_mark_entries_as_clean(): Listed entry not in cache?!?!?.
>     major: Object cache
>     minor: Internal error detected
> HDF5: infinite loop closing library
>
> D,G,A,S,T,F,F,AC,FD,P,FD,P,FD,P,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD
>
> ****** snap ***
>
> I get this error message deterministically if I increase the data output frequency (or the CPU count). In the end I cannot open the file any more, because HDF5 complains that it is corrupted (sure, because it was not properly closed). I get the same error on different computers (with different environments, e.g. compiler, OpenMPI library, distribution). Any idea how to fix this problem is highly appreciated.
>
> Thanks for your help & time
>
> Paul
>
> _______________________________________________
> Hdf-forum is for HDF software users discussion.
> [email protected]
> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
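[Editor's note: one common cause of "infinite loop closing library" at H5Fclose() time is identifiers (datasets, groups, attributes, property lists) left open on some ranks, which also makes the collective metadata flush inconsistent across processes. The sketch below shows a minimal way to check for this before closing; it assumes a parallel HDF5 build (compile with h5pcc, run under mpiexec), and the file name "results.h5" is hypothetical.]

```c
/* Sketch: check for leaked HDF5 identifiers before the collective H5Fclose().
 * Assumes a parallel HDF5 build; the file name is a placeholder. */
#include <stdio.h>
#include <mpi.h>
#include <hdf5.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* Collective file access through the MPI-IO driver */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);

    hid_t file = H5Fcreate("results.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
    H5Pclose(fapl);

    /* ... collective dataset/table writes would happen here ... */

    /* Count identifiers this process still holds open on the file.
     * A count greater than 1 (the file itself) means a dataset, group,
     * attribute, dataspace, or property list was never closed. */
    ssize_t open_objs = H5Fget_obj_count(file, H5F_OBJ_ALL | H5F_OBJ_LOCAL);
    if (open_objs > 1)
        fprintf(stderr, "warning: %zd identifiers still open at close\n",
                (ssize_t)open_objs);

    /* H5Fclose() is collective in parallel HDF5: every rank must reach it,
     * having performed the same sequence of metadata-modifying calls. */
    H5Fclose(file);
    MPI_Finalize();
    return 0;
}
```

[If the count is clean on every rank, the next thing to verify is that all ranks issue the same collective calls in the same order; a single rank skipping a create or close is enough to corrupt the metadata cache exchange seen in the trace above.]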
