Hello,
My program that produces the data is written in C++, and I should be able to
link against the high-level C API library H5TB.
The table has mixed types of ‘fields’ (or columns), e.g. uint64, float, etc.
The table can have up to 80 columns and 10000 rows, but the approach should be
able to scale to larger dimensions. The dimensions of the table are fixed for
each production run of the table.
On the consumer side, I’m considering pandas or PyTables. The program on the
consumer side needs to apply numeric functions along each column, so storing
the columns rather than the rows in contiguous space is much more efficient for
the consumer-side program. Performance is more critical on the consumer side,
as it aggregates output from multiple producer programs.
I’m considering the H5TB high-level API, along with the block-write approach
(i.e. writing a fixed number of rows at a time) proposed by Darryl in this forum:
http://hdf-forum.184993.n3.nabble.com/hdf-forum-Efficient-Way-to-Write-Compound-Data-td193448.html#a193447.
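To make the question concrete, here is a rough sketch of the block-write
pattern I have in mind. The struct, field names, file name, and sizes are just
placeholders (only 3 fields instead of ~80, and the data itself is not filled
in), but it follows my understanding of Darryl's suggestion: create the table
with the first block, then append a fixed number of rows at a time.

#include "hdf5.h"
#include "hdf5_hl.h"
#include <cstdint>
#include <cstddef>
#include <vector>

/* One row (record) of the table; the real table has up to ~80 mixed-type fields. */
struct Record
{
    std::uint64_t id;
    double        value;
    float         weight;
};

int main()
{
    const hsize_t NFIELDS    = 3;
    const hsize_t BLOCK_ROWS = 1000;  /* rows written per block (and per chunk) */
    const hsize_t NBLOCKS    = 10;    /* 10 * 1000 = 10000 rows total */

    const char  *field_names[NFIELDS]  = { "id", "value", "weight" };
    const size_t field_offset[NFIELDS] = { HOFFSET(Record, id),
                                           HOFFSET(Record, value),
                                           HOFFSET(Record, weight) };
    const hid_t  field_types[NFIELDS]  = { H5T_NATIVE_UINT64,
                                           H5T_NATIVE_DOUBLE,
                                           H5T_NATIVE_FLOAT };
    const size_t field_sizes[NFIELDS]  = { sizeof(std::uint64_t),
                                           sizeof(double),
                                           sizeof(float) };

    std::vector<Record> block(BLOCK_ROWS);  /* one block of rows; the producer fills this */

    hid_t file = H5Fcreate("producer.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

    /* Create the table and write the first block; chunk_size is given as a row count. */
    H5TBmake_table("Producer output", file, "table",
                   NFIELDS, BLOCK_ROWS, sizeof(Record),
                   field_names, field_offset, field_types,
                   BLOCK_ROWS /* chunk_size */, NULL /* fill */, 0 /* no compression */,
                   block.data());

    /* Append the remaining blocks, a fixed number of rows at a time. */
    for (hsize_t b = 1; b < NBLOCKS; ++b)
        H5TBappend_records(file, "table", BLOCK_ROWS, sizeof(Record),
                           field_offset, field_sizes, block.data());

    H5Fclose(file);
    return 0;
}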
On each write to the file, I would like each column of the block, rather than
each row, to be stored contiguously in the HDF5 file, on the assumption that
this will help performance when PyTables on the consumer side accesses the
table by column (not by row).
The example code on Chunking
(http://www.hdfgroup.org/HDF5/doc/Advanced/Chunking/) shows a chunk_dims array
with 2 elements. For example, if the block has 1000 rows, I would use
chunk_dims[2] = {1000, 1} so that the 1000 rows of each column are stored in a
contiguous piece of memory.
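For reference, my reading of that chunking example is roughly the following (a
plain 2-D dataset of a single element type, not an H5TB compound table; the
file name, dataset name, and sizes are placeholders):

#include "hdf5.h"

int main()
{
    const hsize_t dims[2]       = { 10000, 80 };  /* rows x columns */
    const hsize_t chunk_dims[2] = { 1000, 1 };    /* 1000 rows of one column per chunk */

    hid_t file  = H5Fcreate("chunked.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate_simple(2, dims, NULL);

    /* The chunk shape is carried by the dataset creation property list. */
    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 2, chunk_dims);

    hid_t dset = H5Dcreate2(file, "matrix", H5T_NATIVE_DOUBLE, space,
                            H5P_DEFAULT, dcpl, H5P_DEFAULT);

    H5Dclose(dset);
    H5Pclose(dcpl);
    H5Sclose(space);
    H5Fclose(file);
    return 0;
}

This per-column chunk layout is what I would like to approximate for the
table, so that each chunk contains 1000 values of a single column.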
Does H5TBmake_table() support such a chunking dimension, and if so, what is
the syntax that I would use?
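For reference, this is the H5TBmake_table prototype as I read it from
H5TBpublic.h; chunk_size appears to be a single record count, and I don't see
where a two-dimensional chunk shape like {1000, 1} would be specified:

herr_t H5TBmake_table(const char *table_title, hid_t loc_id, const char *dset_name,
                      hsize_t nfields, hsize_t nrecords, size_t type_size,
                      const char *field_names[], const size_t *field_offset,
                      const hid_t *field_types, hsize_t chunk_size,
                      void *fill_data, int compress, const void *data);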
Thanks!