Hi Wolf,

I think you are confusing the dataset layout on disk with raw data I/O on the dataset itself. I suggest you go through these; they should clear things up:

http://www.hdfgroup.org/HDF5/doc/UG/UG_frame10Datasets.html (section 5.5, space allocation)
http://www.hdfgroup.org/HDF5/doc/UG/UG_frame12Dataspaces.html (raw data I/O on datasets)
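In short: the layout is a creation-time property of the dataset and must look the same from every process, while the raw data I/O selections may differ per process. A minimal sketch of the distinction (the identifiers fid, filespace, rank, and local_n are placeholders, not names from your code):

    /* Layout is a creation-time property of the dataset.  H5Dcreate is
       collective, so every rank must pass identical arguments: */
    hsize_t chunk_dims[1] = {1024};          /* the SAME on all ranks */
    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 1, chunk_dims);       /* layout: chunked */
    hid_t dset = H5Dcreate(fid, "data", H5T_NATIVE_DOUBLE, filespace,
                           H5P_DEFAULT, dcpl, H5P_DEFAULT);

    /* Raw data I/O, by contrast, may legitimately differ per rank:
       each process selects only the piece of the file it owns. */
    hsize_t start[1] = {rank * local_n};     /* different on each rank */
    hsize_t count[1] = {local_n};
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, start, NULL, count, NULL);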
More documentation and examples are available here:

http://www.hdfgroup.org/HDF5/Tutor/parallel.html

As a rule of thumb for parallel HDF5 users: if you don't understand what dataset chunking is and what it does, don't use it, since it will probably hurt your performance. Use contiguous layout for datasets instead (you can accomplish that by removing the H5Pset calls on the dataset creation property list passed to H5Dcreate).

Thanks,
Mohamad

-----Original Message-----
From: Hdf-forum [mailto:[email protected]] On Behalf Of Wolf Dapp
Sent: Wednesday, April 08, 2015 10:58 AM
To: HDF Users Discussion List
Subject: Re: [Hdf-forum] parallel HDF5: H5Fclose hangs when not using a power of 2 number of processes

On 04/08/15 17:15, Mohamad Chaarawi wrote:
> Hi Wolf,
>
> It is OK to have each processor read/write a different amount of
> elements/data. That is not the problem. The problem is that you
> cannot have each processor specify a different layout for the dataset
> on disk. It is the same problem as having one process say the
> dataset's layout is contiguous while another says it is chunked.
>
> The solution is very simple: just don't adjust the chunk size for the
> dataset on the last process.
>
> I modified the replicator that you provided and attached it to
> demonstrate how this would work (I didn't do a lot of testing on it,
> just on my local machine, but it should work fine).
>
> Thanks,
> Mohamad

Okay, so the only thing you did is to move the H5Pxxxx calls up, /before/ the H5Sxxxx calls, and give them each the same arguments?

By the way, why shouldn't the line read something like

    chunk_dims[0] = (nx % iNumOfProc) ? nx/iNumOfProc + 1 : nx/iNumOfProc;

Why does your version still work for np != 2^X even though the chunks will be too small? (On the other hand, with the above, the combined size of the chunks will be too large, and a chunk size of 1 also seems to work...)

I guess I don't quite understand what the chunk size does in general. Now each processor has the same chunk size, yet the memspaces and hyperslabs still differ. Why aren't those calls collective? Does the chunk size only mean that each process writes the data it owns in chunks of the given size? If one chunk is not enough, does it simply write a second/third/fourth chunk, and if the data is smaller than the chunk, does it write whatever it has? Is that how it works?

Thank you very much for your help, Mohamad! Thanks to Mark and Timothy for their input, too! Much appreciated!

Cheers,
Wolf

--
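A minimal sketch of the pattern Mohamad describes above: identical creation-time arguments on every rank, per-rank hyperslab selections for the write. The names fid, rank, and buf are placeholders standing in for the replicator's actual variables; nx and iNumOfProc follow the naming in the question.

    #include <hdf5.h>
    #include <mpi.h>

    /* ... MPI_Init done, fid opened with an MPI-IO file access plist ... */

    /* Every rank computes the SAME chunk size (ceiling division), so the
       H5Pset_chunk/H5Dcreate arguments match on all processes: */
    hsize_t chunk_dims[1];
    chunk_dims[0] = (nx % iNumOfProc) ? nx / iNumOfProc + 1 : nx / iNumOfProc;

    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 1, chunk_dims);

    hsize_t dims[1] = {nx};
    hid_t filespace = H5Screate_simple(1, dims, NULL);
    hid_t dset = H5Dcreate(fid, "data", H5T_NATIVE_DOUBLE, filespace,
                           H5P_DEFAULT, dcpl, H5P_DEFAULT);

    /* The selections MAY differ per rank: the last rank simply owns
       fewer elements when nx is not divisible by iNumOfProc.
       (For simplicity this assumes every rank owns at least one.) */
    hsize_t my_start[1], my_count[1];
    my_start[0] = (hsize_t)rank * chunk_dims[0];
    my_count[0] = (rank == iNumOfProc - 1) ? nx - my_start[0]
                                           : chunk_dims[0];

    hid_t memspace = H5Screate_simple(1, my_count, NULL);
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, my_start, NULL,
                        my_count, NULL);

    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);   /* collective write */
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, buf);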
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5