My recollection is that a developer somewhere in Europe (maybe CERN) built a convenience API on top of HDF5 that simplifies collective dataset creation a bit: processors work independently to _define_ (names, types, sizes, and shapes) the datasets they need to create, and then call a collective _sync_ method where all the collective dataset creation happens down in HDF5. Datasets from different ranks that have the same attributes (e.g. name, type, size, and shape) and are marked with the same 'tag' wind up being common across the ranks that passed that tag. After the collective _sync_ operation, processors can again engage in independent (or collective) I/O to the datasets.
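In rough pseudocode, the define/sync pattern described above might look like the following. The API itself is unnamed in this thread, so every identifier here is invented purely for illustration; only the three-phase shape (independent define, collective sync, independent I/O) comes from the description above.

```
/* Phase 1: each rank independently DEFINES the datasets it needs.
   No HDF5 calls happen yet; definitions are just recorded locally. */
for each dataset this rank wants:
    define_dataset(name, type, size, shape, tag);

/* Phase 2: one collective sync. Definitions from different ranks
   that share the same tag and attributes collapse into a single
   common dataset, and all the collective H5Dcreate() traffic is
   issued here, on every rank, by the library. */
sync_definitions(communicator);

/* Phase 3: ranks resume independent (or collective) reads and
   writes against the datasets that now exist in the file. */
write_my_data(my_dataset_handle, buffer);
```

The appeal is that application code never has to arrange for every rank to name every dataset; the sync step does that bookkeeping once, collectively.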
I have never used that API, and I'll be darned if I can remember the name of it (I spent 20 minutes looking on Google); I don't even know if it is still being maintained. But it does provide a much simpler way of interacting with HDF5's collective dataset creation requirement when that is necessary. It might be an option if you can find it, or if another user here familiar with what I am talking about can send a link ;)

Mark

From: Hdf-forum <[email protected]> on behalf of Mohamad Chaarawi <[email protected]>
Reply-To: HDF Users Discussion List <[email protected]>
Date: Wednesday, August 31, 2016 at 6:19 AM
To: HDF Users Discussion List <[email protected]>
Subject: Re: [Hdf-forum] Parallel I/O with HDF5

The dataset creation has to be called on all ranks, not the actual writing of the array data. So all ranks should call H5Dcreate() for all the datasets, but then each rank can write to its corresponding dataset. Alternatively, you can have one rank create the entire file serially, close the file, and then have all ranks open it and write the raw data in parallel.

Thanks,
Mohamad

From: Hdf-forum <[email protected]> on behalf of jaber javanshir <[email protected]>
Reply-To: hdf-forum <[email protected]>
Date: Tuesday, August 30, 2016 at 4:21 PM
To: hdf-forum <[email protected]>
Subject: [Hdf-forum] Parallel I/O with HDF5

Hi All,

Hope all is well. I am trying to use HDF5's parallel features for extreme-scale computing, and I would like each processor to write out a separate dataset. This question is actually addressed on the HDF5 website: because dataset creation is a collective call, every processor has to take part in creating every dataset.
https://www.hdfgroup.org/HDF5/faq/parallel.html ("How do you write to a single file in parallel in which different processes write to separate datasets?")

Please advise on this matter. The answer there is not satisfying for extreme-scale computing, where hundreds of thousands of cores are involved. Is there a better way of overcoming this issue? Your advice is greatly appreciated.

Thanks,
Dr J
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5
