Hi Mohamad,

On Thu, Jul 02, 2015 at 01:56:40PM +0000, Mohamad Chaarawi wrote:
> I entered an issue for parallel support and improvements for
> H5Ocopy() in our Jira database (HDFFV-9435), but to be honest, I am
> not sure if we will have time to fix it for parallel unless someone
> funds it, since this isn't a high priority feature at the moment.

Thank you for confirming my guess. I will keep this in mind in case I acquire funding of my own for a project using parallel HDF5.

I use H5Ocopy to make atomic snapshots of output datasets during a simulation. The datasets have chunked layout with time-varying data and grow over the course of the simulation. If the simulation is interrupted, the output file is unreadable since HDF5 does not implement metadata journaling (yet?).

To make a consistent snapshot, I create another HDF5 file with a temporary filename. All output datasets are copied to that snapshot file. Then the file is flushed to storage with H5Fflush. When using MPI, this implicitly invokes MPI_File_sync. Otherwise, in the serial case, fsync must be invoked on the file descriptor retrieved with H5Fget_vfd_handle. After the data has been written to storage, the snapshot file is renamed to a non-temporary filename, which overwrites the previous snapshot file.

Since H5Ocopy is a collective call, if the output file is opened by all processes, so must the snapshot file. For now I worked around the issue by keeping the per-node output data in memory until the end of the simulation, thus avoiding H5Ocopy entirely.

Regards,
Peter

_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5
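
The serial-case "flush, fsync, rename" step described in the message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the message: the function name snapshot_replace and its parameters are hypothetical, and plain POSIX write() stands in for the HDF5 part (copying datasets with H5Ocopy and flushing with H5Fflush), with fd playing the role of the descriptor that would be retrieved with H5Fget_vfd_handle.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `len` bytes to `tmp_path`, force them to
 * storage, then rename the file onto `final_path`.
 * Returns 0 on success, -1 on failure (the temporary file is removed).
 *
 * Assumption: in the HDF5 workflow the write() loop would be replaced by
 * H5Ocopy + H5Fflush on the snapshot file, and `fd` would be the
 * descriptor obtained from H5Fget_vfd_handle.
 */
int snapshot_replace(const char *tmp_path, const char *final_path,
                     const void *data, size_t len)
{
    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Write the whole buffer; write() may accept fewer bytes per call. */
    const char *p = data;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n <= 0) {
            close(fd);
            unlink(tmp_path);
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }

    /* The data must reach storage before the rename makes it visible. */
    if (fsync(fd) != 0) {
        close(fd);
        unlink(tmp_path);
        return -1;
    }
    if (close(fd) != 0) {
        unlink(tmp_path);
        return -1;
    }

    /* POSIX rename() atomically replaces an existing final_path. */
    if (rename(tmp_path, final_path) != 0) {
        unlink(tmp_path);
        return -1;
    }
    return 0;
}
```

Calling the helper repeatedly with the same final_path means a reader opening that path sees either the previous snapshot or the new one, never a partially written file, which is the property the temporary filename is there to provide.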
