Hello all,

The HDF5 FAQ (https://www.hdfgroup.org/HDF5/faq/limits.html) refers to an 
example that creates 100'000 groups in the 'How many links can be in a group?' 
section.

My problem is that I need to create at least 1'000'000 groups in a single file, 
and group creation slows down dramatically after about 900'000 groups.
The application is written in C++ with HDF5 1.8.5, running on Windows 7 64-bit 
with 16 GB RAM.

To investigate more quickly, I wrote a very simple Python example, and I can 
reproduce the issue on an iMac (64-bit, 32 GB RAM, OS X 10.11).
Each batch of 100'000 groups takes between 6 and 7 seconds on average, but the 
last batch, after 900'000 groups have been created, takes about 6 minutes!

I suppose that I need to configure something in HDF5 to avoid this kind of 
issue, e.g. set a larger cache size, or something else...
I would really appreciate it if someone knows the reason for this behavior!
Here is the Python example with the output it produced.
Best regards,
Levent


import h5py as h5
from datetime import datetime

print(h5.version.info)
hf = h5.File("f.h5", "w")
print(str(datetime.now()))  # start timestamp

for i in range(1, 1000000):
    hf.create_group("/Acquisition." + str(i))  # create a group
    if not i % 100000:
        # timestamp after each 100'000 groups created
        print(str(datetime.now()) + ' : ' + str(i))

print(str(datetime.now()))  # end timestamp



Summary of the h5py configuration
---------------------------------
h5py    2.5.0
HDF5    1.8.13
Python  3.5.0 (default, Sep 14 2015, 02:37:27) [GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.53)]
sys.platform    darwin
sys.maxsize     9223372036854775807
numpy   1.10.1

2015-11-25 10:16:48.109794
2015-11-25 10:16:54.340278 : 100000
2015-11-25 10:17:00.661270 : 200000
2015-11-25 10:17:07.006722 : 300000
2015-11-25 10:17:13.435274 : 400000
2015-11-25 10:17:19.829139 : 500000
2015-11-25 10:17:27.221807 : 600000
2015-11-25 10:17:33.599402 : 700000
2015-11-25 10:17:39.979077 : 800000
2015-11-25 10:17:46.284342 : 900000
2015-11-25 10:23:36.377318
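
To make the slowdown easier to read off the output above, here is a small sketch that turns the timestamps into elapsed seconds per batch. It uses only the Python standard library; the names `stamps`, `FMT` and `deltas` are just illustrative, and the timestamps are copied by hand from the run above.

```python
from datetime import datetime

# Timestamps copied from the output above:
# start, then one per 100'000 groups, then the end timestamp.
stamps = [
    "2015-11-25 10:16:48.109794",
    "2015-11-25 10:16:54.340278",
    "2015-11-25 10:17:00.661270",
    "2015-11-25 10:17:07.006722",
    "2015-11-25 10:17:13.435274",
    "2015-11-25 10:17:19.829139",
    "2015-11-25 10:17:27.221807",
    "2015-11-25 10:17:33.599402",
    "2015-11-25 10:17:39.979077",
    "2015-11-25 10:17:46.284342",
    "2015-11-25 10:23:36.377318",
]

FMT = "%Y-%m-%d %H:%M:%S.%f"
times = [datetime.strptime(s, FMT) for s in stamps]

# Elapsed seconds between consecutive timestamps, i.e. one value per batch.
deltas = [(b - a).total_seconds() for a, b in zip(times, times[1:])]

for n, d in enumerate(deltas, start=1):
    print("batch %2d: %8.2f s" % (n, d))
```

The first nine batches each come out at roughly 6 to 7.5 seconds, while the last interval is about 350 seconds. Note that the loop uses range(1, 1000000), so the last interval covers groups 900'001 to 999'999 and ends at the final timestamp rather than at a "1000000" line.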

