Hi,

I am working with a 10000 row by 200000 column matrix of 4-byte floats.  The 
matrix is unavoidably written in sequential row-major order, but needs to be 
read in sequential column-wise order.  I chunk the matrix into 1000x1000 
chunks, as a compromise between write performance (row-wise) and read 
performance (column-wise).

When writing the file, I set the chunk cache to be big enough to hold an entire 
row’s worth of chunks (i.e. 200000 / 1000 = 200 chunks multiplied by 4e6 bytes 
per chunk, 800 MB in total).  My write times per row are of the order of 5 ms, 
and the algorithm pauses after each 1000 rows.  By monitoring I/O at the 
filesystem level, I see spikes of disk activity during these pauses, with 
transfer rates approaching the maximum.  I conclude that the chunk cache is 
effectively buffering 1000 rows of the matrix, and flushing to disk only when 
all chunks have been written.  So far so good: HDF5 is making my life easy :)
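
To make the write-side sizing concrete, here is a minimal sketch of the arithmetic described above. The `Layout` struct and the helper function names are hypothetical, introduced only for illustration; none of them are part of the HDF5 API.

```cpp
#include <cassert>
#include <cstddef>

// Matrix and chunk dimensions, as described in the text above.
// (Hypothetical helper type, not an HDF5 type.)
struct Layout {
    std::size_t rows, cols;            // matrix shape
    std::size_t chunkRows, chunkCols;  // chunk shape
    std::size_t elemBytes;             // bytes per element
};

// Bytes occupied by one chunk in memory.
std::size_t chunkBytes(const Layout& l) {
    return l.chunkRows * l.chunkCols * l.elemBytes;
}

// Number of chunks that a single matrix row passes through (rounded up).
std::size_t chunksPerRow(const Layout& l) {
    assert(l.chunkCols > 0);
    return (l.cols + l.chunkCols - 1) / l.chunkCols;
}

// Cache bytes needed to hold every chunk touched by one row.
std::size_t writeCacheBytes(const Layout& l) {
    return chunksPerRow(l) * chunkBytes(l);
}
```

With `Layout{10000, 200000, 1000, 1000, 4}` this gives 200 chunks of 4e6 bytes each, i.e. 8e8 bytes of chunk cache on the write side.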

However, when reading the file, I reserve enough chunk cache to accommodate a 
column’s worth of chunks (10000 / 1000 = 10 chunks multiplied by 4e6 bytes per 
chunk, 40 MB in total).  My column read time is of the order of 10 ms, but I 
don’t see pauses or spikes of disk activity as with the write.  Instead, I get 
a steady trickle of disk activity that does not appear to be correlated with 
the chunk width as I was expecting.  It therefore appears that the chunk cache 
is not being used.
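
The read pattern I was expecting can be sketched as follows. This is an idealized model, not a description of what HDF5 actually does: it assumes a cache that holds one full column of chunks and evicts nothing early. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Cache bytes needed to hold every chunk touched by one column.
std::size_t readCacheBytes(std::size_t rows, std::size_t chunkRows,
                           std::size_t chunkCols, std::size_t elemBytes) {
    assert(chunkRows > 0);
    std::size_t chunksPerCol = (rows + chunkRows - 1) / chunkRows;
    return chunksPerCol * chunkRows * chunkCols * elemBytes;
}

// Idealized number of chunk loads when reading columns [0, nCols) in order.
// Assumption: each chunk is loaded once and stays cached until the reader
// has moved past the last column it covers.
std::size_t idealChunkLoads(std::size_t nCols, std::size_t rows,
                            std::size_t chunkRows, std::size_t chunkCols) {
    assert(chunkRows > 0 && chunkCols > 0);
    std::size_t chunksPerCol = (rows + chunkRows - 1) / chunkRows;
    std::size_t chunkColsTouched = (nCols + chunkCols - 1) / chunkCols;
    return chunksPerCol * chunkColsTouched;
}
```

Under this model, the first 1000 columns cost 10 chunk loads in total, and the next burst of disk reads would only come at column 1001. That is the periodicity I expected to see but do not.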

1) Should I expect this behaviour?
2) Have I set up the chunk cache correctly (code below), and do I have to 
explicitly tell HDF5 to read data chunk-wise from a chunked-layout file?
3) What is the best way to monitor cache flushing/pre-emption activity?

Thanks,

Chris


Sample C++ code: 

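 // Chunk cache setup (read side)
 H5::FileAccPropList fprops = file.getAccessPlist();
 int mdc;          // metadata cache elements
 size_t ccelems;   // number of chunk cache slots
 size_t ccnbytes;  // chunk cache size in bytes
 double w0;        // chunk pre-emption policy
 fprops.getCache(mdc, ccelems, ccnbytes, w0);

 // Room for one column's worth of chunks (10 chunks of chunkDim[0] x chunkDim[1] floats)
 size_t chunksPerCol = 10000 / 1000;
 ccnbytes = chunksPerCol * chunkDim[0] * chunkDim[1] * sizeof(float);

 fprops.setCache(mdc, ccelems, ccnbytes, w0);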


