Hi Wolf,

I think you are confusing the dataset layout on disk with raw data I/O on the dataset itself. I suggest you go through these; they should clear things up:

http://www.hdfgroup.org/HDF5/doc/UG/UG_frame10Datasets.html (section 5.5, space allocation)
http://www.hdfgroup.org/HDF5/doc/UG/UG_frame12Dataspaces.html (raw data I/O on datasets)
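In short: the layout is a creation-time property of the dataset and must look the same from every process, while the raw data I/O selections may differ per process. A minimal sketch of the distinction (the identifiers fid, filespace, rank, and local_n are placeholders, not names from your code):

    /* Layout is a creation-time property of the dataset.  H5Dcreate is
       collective, so every rank must pass identical arguments: */
    hsize_t chunk_dims[1] = {1024};          /* the SAME on all ranks */
    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 1, chunk_dims);       /* layout: chunked */
    hid_t dset = H5Dcreate(fid, "data", H5T_NATIVE_DOUBLE, filespace,
                           H5P_DEFAULT, dcpl, H5P_DEFAULT);

    /* Raw data I/O, by contrast, may legitimately differ per rank:
       each process selects only the piece of the file it owns. */
    hsize_t start[1] = {rank * local_n};     /* different on each rank */
    hsize_t count[1] = {local_n};
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, start, NULL, count, NULL);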
More documentation and examples are available here:

http://www.hdfgroup.org/HDF5/Tutor/parallel.html

As a rule of thumb for parallel HDF5 users: if you don't understand what dataset chunking is and what it does, don't use it, since it will probably hurt your performance. Use contiguous layout for datasets instead (you can accomplish that by removing the H5Pset calls on the dataset creation property list passed to H5Dcreate).

Thanks,
Mohamad

-----Original Message-----
From: Hdf-forum [mailto:[email protected]] On Behalf Of Wolf Dapp
Sent: Wednesday, April 08, 2015 10:58 AM
To: HDF Users Discussion List
Subject: Re: [Hdf-forum] parallel HDF5: H5Fclose hangs when not using a power of 2 number of processes

On 04/08/15 17:15, Mohamad Chaarawi wrote:
> Hi Wolf,
>
> It is OK to have each processor read/write a different amount of
> elements/data. That is not the problem. The problem is that you
> cannot have each processor specify a different layout for the dataset
> on disk. It is the same problem as having one process say the
> dataset's layout is contiguous while another says it is chunked.
>
> The solution is very simple: just don't adjust the chunk size for the
> dataset on the last process.
>
> I modified the replicator that you provided and attached it to
> demonstrate how this would work (I didn't do a lot of testing on it,
> just on my local machine, but it should work fine).
>
> Thanks,
> Mohamad

Okay, so the only thing you did is to move the H5Pxxxx calls up, /before/ the H5Sxxxx calls, and give them each the same arguments?

By the way, why shouldn't the line read something like

    chunk_dims[0] = (nx % iNumOfProc) ? nx/iNumOfProc + 1 : nx/iNumOfProc;

Why does your version still work for np != 2^X even though the chunks will be too small? (On the other hand, with the above, the combined size of the chunks will be too large, and a chunk size of 1 also seems to work...)

I guess I don't quite understand what the chunk size does in general. Now each processor has the same chunk size, yet the memspaces and hyperslabs still differ. Why aren't those calls collective? Does the chunk size only mean that each process writes the data it owns in chunks of the given size? If one chunk is not enough, does it simply write a second/third/fourth chunk, and if the data is smaller than the chunk, does it write whatever it has? Is that how it works?

Thank you very much for your help, Mohamad! Thanks to Mark and Timothy for their input, too! Much appreciated!

Cheers,
Wolf

--
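A minimal sketch of the pattern Mohamad describes above: identical creation-time arguments on every rank, per-rank hyperslab selections for the write. The names fid, rank, and buf are placeholders standing in for the replicator's actual variables; nx and iNumOfProc follow the naming in the question.

    #include <hdf5.h>
    #include <mpi.h>

    /* ... MPI_Init done, fid opened with an MPI-IO file access plist ... */

    /* Every rank computes the SAME chunk size (ceiling division), so the
       H5Pset_chunk/H5Dcreate arguments match on all processes: */
    hsize_t chunk_dims[1];
    chunk_dims[0] = (nx % iNumOfProc) ? nx / iNumOfProc + 1 : nx / iNumOfProc;

    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 1, chunk_dims);

    hsize_t dims[1] = {nx};
    hid_t filespace = H5Screate_simple(1, dims, NULL);
    hid_t dset = H5Dcreate(fid, "data", H5T_NATIVE_DOUBLE, filespace,
                           H5P_DEFAULT, dcpl, H5P_DEFAULT);

    /* The selections MAY differ per rank: the last rank simply owns
       fewer elements when nx is not divisible by iNumOfProc.
       (For simplicity this assumes every rank owns at least one.) */
    hsize_t my_start[1], my_count[1];
    my_start[0] = (hsize_t)rank * chunk_dims[0];
    my_count[0] = (rank == iNumOfProc - 1) ? nx - my_start[0]
                                           : chunk_dims[0];

    hid_t memspace = H5Screate_simple(1, my_count, NULL);
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, my_start, NULL,
                        my_count, NULL);

    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);   /* collective write */
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, buf);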
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5