Hi, I am working with a 10000-row by 200000-column matrix of 4-byte floats. The matrix is unavoidably written in sequential row-major order, but needs to be read sequentially, column by column. I chunk the matrix into 1000x1000 chunks as a compromise between write performance (row-wise) and read performance (column-wise).
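For reference, the chunk geometry described above can be worked through in a few lines. This is only a sketch of the arithmetic: the struct name `ChunkLayout` and its member names are hypothetical and are not part of the HDF5 API.

```cpp
#include <cassert>
#include <cstddef>

// Shape of the matrix and of one chunk, as given in the post.
// ChunkLayout is a hypothetical helper, not an HDF5 type.
struct ChunkLayout {
    std::size_t rows = 10000;
    std::size_t cols = 200000;
    std::size_t chunkRows = 1000;
    std::size_t chunkCols = 1000;
    std::size_t elemBytes = 4;  // 4-byte floats

    // Bytes held by a single chunk.
    std::size_t chunkBytes() const { return chunkRows * chunkCols * elemBytes; }

    // Chunks touched when one full row of the matrix is written.
    std::size_t chunksPerRowBand() const {
        assert(cols % chunkCols == 0);
        return cols / chunkCols;
    }

    // Chunks touched when one full column of the matrix is read.
    std::size_t chunksPerColumnBand() const {
        assert(rows % chunkRows == 0);
        return rows / chunkRows;
    }
};
```

With these numbers, one chunk is 4,000,000 bytes, a row touches 200 chunks (800,000,000 bytes in total) and a column touches 10 chunks (40,000,000 bytes in total).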
When writing the file, I set the chunk cache to be big enough to hold an entire row’s worth of chunks (i.e. 200000 / 1000 = 200 chunks multiplied by 4e6 bytes). My write times per row are of the order of 5 ms, and the algorithm pauses after each 1000 rows. By monitoring I/O at the filesystem level, I see spikes of disk activity during these pauses, with transfer rates approaching the maximum. I conclude that the chunk cache is effectively buffering 1000 rows of the matrix and flushing to disk only when all chunks have been written. So far so good: HDF5 is making my life easy :)

However, when reading the file, I reserve enough chunk cache to accommodate a column’s worth of chunks (10000 / 1000 = 10 chunks multiplied by 4e6 bytes). My column read time is of the order of 10 ms, but I don’t see pauses or spikes of disk activity as with the write. Instead, I get a steady trickle of disk activity that is not correlated with the chunk width, contrary to what I was expecting. It therefore appears that the chunk cache is not being used.

1) Should I expect this behaviour?
2) Have I set up the chunk cache correctly (code below), and do I have to explicitly tell HDF5 to read data chunk-wise from a chunked-layout file?
3) How best to monitor cache flushing/pre-emption activity?

Thanks,
Chris

Sample C++ code:

    // Cache
    H5::FileAccPropList fprops = file.getAccessPlist();
    int mdc;          // number of metadata cache elements
    size_t ccelems;   // number of slots in the raw data chunk cache
    size_t ccnbytes;  // size of the raw data chunk cache, in bytes
    double w0;        // chunk preemption policy
    fprops.getCache(mdc, ccelems, ccnbytes, w0);

    // Size the cache to hold one column's worth of chunks
    size_t chunksPerCol = 10000 / 1000;
    ccnbytes = chunksPerCol * chunkDim[0] * chunkDim[1] * sizeof(float);
    fprops.setCache(mdc, ccelems, ccnbytes, w0);
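To make the expectation in the post concrete (a burst of chunk loads once per 1000 columns, and nothing in between), here is a toy model of a cache that holds whole chunks. It assumes a plain least-recently-used policy; HDF5's real chunk cache is configured through a slot count and the w0 preemption weight and may well behave differently. `ToyChunkCache` and `countChunkLoads` are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>

// Toy LRU cache keyed by chunk index. NOT HDF5's implementation:
// it only models "a cache that can hold `capacity` whole chunks".
class ToyChunkCache {
public:
    explicit ToyChunkCache(std::size_t capacity) : capacity_(capacity) {
        assert(capacity_ > 0);
    }

    // Touch one chunk. Returns true if it had to be loaded (a miss).
    bool access(std::size_t chunkId) {
        auto it = std::find(lru_.begin(), lru_.end(), chunkId);
        if (it != lru_.end()) {
            lru_.splice(lru_.begin(), lru_, it);  // mark as most recently used
            return false;
        }
        if (lru_.size() == capacity_) {
            lru_.pop_back();  // evict the least recently used chunk
        }
        lru_.push_front(chunkId);
        return true;
    }

private:
    std::size_t capacity_;
    std::list<std::size_t> lru_;
};

// Read columns [0, nCols) in order. Each column touches every chunk in its
// column of chunks. Returns the total number of chunk loads (misses).
std::size_t countChunkLoads(std::size_t capacity, std::size_t nCols) {
    const std::size_t chunkBandsDown = 10;  // 10000 rows / 1000 rows per chunk
    const std::size_t chunksAcross = 200;   // 200000 cols / 1000 cols per chunk
    const std::size_t chunkCols = 1000;

    ToyChunkCache cache(capacity);
    std::size_t loads = 0;
    for (std::size_t col = 0; col < nCols; ++col) {
        const std::size_t chunkCol = col / chunkCols;
        for (std::size_t band = 0; band < chunkBandsDown; ++band) {
            if (cache.access(band * chunksAcross + chunkCol)) {
                ++loads;
            }
        }
    }
    return loads;
}
```

In this model, a cache of 10 chunks loads each chunk exactly once (30 loads for the first 3000 columns, all at columns 0, 1000 and 2000), whereas a cache of only 9 chunks reloads every chunk on every column. A steady trickle of reads is therefore what the toy model would show if the requested cache size were not actually in effect.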
