Hi Guillaume,
Are you using chunked or contiguous datasets? If chunked, what chunk size
are you using? Also, can you use the “latest” version of the file format, which
should produce smaller files but is only readable by HDF5 1.10.x or later?
(I.e., call H5Pset_libver_bounds with “latest” for both the low and high bounds:
https://support.hdfgroup.org/HDF5/doc/RM/H5P/H5Pset_libver_bounds.htm)
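
To show why the chunk size matters for a file like this one, here is a rough back-of-the-envelope sketch in Python. It assumes that an unfiltered (uncompressed) chunk is allocated at its full size on disk, and it ignores all per-dataset and per-chunk metadata. The function name and the candidate chunk lengths are illustrative only and are not taken from your program.

```python
import math


def chunked_storage_bytes(n_datasets, n_values, chunk_len, elem_size=8):
    """Estimate raw chunk storage for 1-D chunked datasets.

    Assumption: every chunk is stored at its full size, even when it is
    only partially filled. Metadata (object headers, chunk indexes) is
    not counted, so real files will be larger than this estimate.
    """
    chunks_per_dataset = math.ceil(n_values / chunk_len)
    return n_datasets * chunks_per_dataset * chunk_len * elem_size


if __name__ == "__main__":
    # Figures from the original message: 19000 datasets of 21 doubles each.
    raw_bytes = 19000 * 21 * 8
    print(f"raw data: {raw_bytes / 1e6:.3f} MB")

    # Hypothetical chunk lengths, in elements, chosen for illustration.
    for chunk_len in (1, 21, 256, 1024):
        total = chunked_storage_bytes(19000, 21, chunk_len)
        print(f"{chunk_len:5d} elements/chunk -> {total / 1e6:8.3f} MB")
```

Under that assumption, a chunk of 256 elements already reserves about 38.9 MB across the 19000 datasets, while a chunk of 21 elements stays at the 3.2 MB of raw data. Very small chunks are not free either, since each chunk carries index metadata that this estimate leaves out.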
Quincey
> On May 23, 2017, at 3:02 AM, Guillaume Jacquenot wrote:
>
> Hello everyone!
>
> I am creating an HDF5 file from a Fortran program, and I am confused about the
> size of the generated file.
>
> I am writing 19000 datasets, each holding 21 64-bit real values.
> I write one value at a time, extending each of the 19000 datasets by one
> element every time.
> All data are written correctly.
> But the generated file is more than 48 MB.
> I expected the total size of the file to be a little bigger than the raw
> data, about 3.2 MB (21*19000*8 / 1e6 = 3.192 MB).
> If I only create 19000 empty datasets, I obtain a 6 MB HDF5 file, which means
> each empty dataset takes about 400 bytes.
> I guess I should be able to create a ~10 MB (6 MB + 3.2 MB) HDF5 file that
> contains everything.
>
> For comparison, if I write everything to a text file, where each real number
> is written with 15 characters, I obtain a 6 MB CSV file.
>
> Question 1)
> Is this behaviour normal?
>
> Question 2)
> Can extending a dataset every time data is written significantly
> increase the total disk space required?
> Can preallocating the datasets and writing with hyperslabs save some space?
> Can the chunk parameters affect the size of the generated HDF5 file?
>
> Question 3)
> If I pack everything into a compound dataset with 19000 columns, will the
> resulting file be smaller?
>
> N.B.:
> Looking at the example that generates 100000 groups (grplots.c), the size
> of the generated HDF5 file is 78 MB for 100000 empty groups.
> That means each group takes about 780 bytes.
> https://support.hdfgroup.org/ftp/HDF5/examples/howto/crtmany/grplots.c
>
> Guillaume Jacquenot
>
>
>
> _______________________________________________
> Hdf-forum is for HDF software users discussion.
> [email protected]
> http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
> Twitter: https://twitter.com/hdf5