Juha Jäykkä <[email protected]> writes: > For small files, chunking is probably not going to change performance in any > significant manner, so one option could be to simply not chunk small files at > all
This is effectively what is done now, considering that HDF5 needs
chunking to be enabled to use H5S_UNLIMITED.

> and then chunk big files "optimally" – whatever that means. HDFgroup
> seems to think that "the chunk size be approximately equal to the
> average expected size of the data block needed by the application."
> (http://www.hdfgroup.org/training/HDFtraining/UsersGuide/Perform.fm2.html)
> For more chunking stuff:
>
> In the case of PETSc I think that means not the WHOLE application, but
> one MPI rank (or perhaps one SMP host running a mixture of MPI ranks
> and OpenMP threads), which is probably always going to be < 4 GB
> (except perhaps in the mixture case).

Output uses a collective write, so the granularity of the IO node is
probably more relevant for writing (e.g., BG/Q would have one IO node
per 128 compute nodes), but almost any chunk size should perform
similarly. It would make a lot more difference for something like
visualization, where subsets of the data are read, typically with
independent IO.

> turning chunking completely off works too

Are you sure? Did you try writing a second time step? The documentation
says that H5S_UNLIMITED requires chunking.

> See above, but note also that there can at most be 64k chunks in the
> file, so fixing the chunk size to 10 MiB means limiting file size to
> 640 GiB.

Thanks for noticing this limit. It might come from the 64k limit on
attribute sizes.

> My suggestion is to give PETSc a little more logic here, something
> like this:
>
> if sizeof(data) > 4GiB * 64k: no chunking # impossible to chunk!
> elif sizeof(data) < small_file_limit: no chunking # probably best for speed
> elif current rank's data size < 4 GB: chunk using current rank's data size

Chunk size needs to be collective. We could compute an average size
from each subdomain, but can't just use the subdomain size.

> else divide current rank's data size by 2**(number of dimensions)
> until < 4 GB and then use that chunk size.
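For concreteness, the proposed decision logic (using an average rank size so the result is collective-safe, per the point above) might look roughly like this in C. Names such as choose_chunk_bytes and small_file_limit are illustrative, not existing PETSc or HDF5 API:

```c
#include <stdint.h>

#define GiB ((uint64_t)1 << 30)
#define MAX_CHUNKS ((uint64_t)1 << 16)   /* 64k-chunk limit noted above */
#define CHUNK_BYTE_LIMIT (4 * GiB)       /* 4 GiB cap on a single chunk */

/* Sketch of the suggested heuristic. Returns a chunk size in bytes,
 * or 0 meaning "do not chunk". avg_rank_bytes would be an average
 * subdomain size agreed on collectively, not the local size. */
static uint64_t choose_chunk_bytes(uint64_t total_bytes,
                                   uint64_t avg_rank_bytes,
                                   uint64_t small_file_limit,
                                   int ndims)
{
    if (total_bytes > CHUNK_BYTE_LIMIT * MAX_CHUNKS)
        return 0;   /* impossible to chunk within both limits */
    if (total_bytes < small_file_limit)
        return 0;   /* small file: chunking unlikely to help */
    uint64_t chunk = avg_rank_bytes;
    while (chunk >= CHUNK_BYTE_LIMIT)
        chunk >>= ndims;  /* halve each of ndims dimensions */
    return chunk;
}
```

Note this keeps the "no chunking" branches from the quoted proposal even though, as discussed above, disabling chunking conflicts with H5S_UNLIMITED; a real implementation would have to pick a fallback chunk size instead.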
We might want the chunk size to be smaller than 4 GiB anyway to avoid
out-of-memory problems for readers and writers. I think the chunk size
(or maximum chunk size) should be settable by the user.
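A user-settable maximum could be applied on top of whatever chunk shape is computed, e.g. by halving the largest extent until the chunk fits. This is only a sketch; limit_chunk and its parameters are hypothetical:

```c
#include <stdint.h>

/* Shrink per-dimension chunk extents (in elements) by halving the
 * largest extent until the chunk occupies at most max_bytes.
 * elem_size is bytes per element. Illustrative only. */
static void limit_chunk(uint64_t dims[], int ndims,
                        uint64_t elem_size, uint64_t max_bytes)
{
    for (;;) {
        uint64_t bytes = elem_size;
        for (int i = 0; i < ndims; i++)
            bytes *= dims[i];
        if (bytes <= max_bytes)
            return;
        int big = 0;  /* halve the largest extent first */
        for (int i = 1; i < ndims; i++)
            if (dims[i] > dims[big])
                big = i;
        if (dims[big] == 1)
            return;   /* cannot shrink any further */
        dims[big] = (dims[big] + 1) / 2;
    }
}
```

Halving the largest extent first keeps the chunk shape close to the subdomain shape, which matters for read patterns like visualization that touch subsets of the data.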
