On 17.05.2015 19:47, Elena Pourmal wrote:
Thank you for your input. A global cache is one of the improvements we
are considering for the library. We see more and more applications that
open and process multiple files and datasets, and you are absolutely
correct that memory becomes an issue in this case.
That's good news!
Interface-wise, it seems that a pair of additional routines,
H5Pset_global_chunk_cache / H5Pget_global_chunk_cache, with the same
arguments as H5Pset_chunk_cache / H5Pget_chunk_cache, should be
introduced. If H5Pset_global_chunk_cache has been called, both
limitations (on each single dataset and on the total cache size) should
take effect.
The intricate question is how to handle the rdcc_w0 parameter
(different datasets may have different values).
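To make this concrete, the pair I have in mind might look as follows.
These are hypothetical signatures that mirror H5Pset_chunk_cache; they
would presumably act on a file access property list so that the limits
cover all datasets opened through it:

    /* Hypothetical: these routines do not exist in HDF5 today. */
    herr_t H5Pset_global_chunk_cache(hid_t fapl_id,
                                     size_t rdcc_nslots, /* total hash slots */
                                     size_t rdcc_nbytes, /* total cache size in bytes */
                                     double rdcc_w0);    /* global preemption policy */
    herr_t H5Pget_global_chunk_cache(hid_t fapl_id,
                                     size_t *rdcc_nslots,
                                     size_t *rdcc_nbytes,
                                     double *rdcc_w0);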
Another implementation strategy might be to introduce "dataset groups"
that share the same cache. This mechanism could be made mutually
exclusive with H5Pset_chunk_cache / H5Pget_chunk_cache, so that the
dataset-specific rdcc_nslots, rdcc_nbytes, and rdcc_w0 lose their effect
if a cache group is enabled (at dataset open time). It might bring more
flexibility, but might be harder to document and use.
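To illustrate the idea, here is a made-up usage sketch; every
*_chunk_cache_group name below is invented and does not exist in the
library:

    /* Hypothetical API, for illustration only; assumes file_id is an
       already-open file and nslots/nbytes/w0 are the desired limits. */
    hid_t cache_grp = H5create_chunk_cache_group(nslots, nbytes, w0);
    hid_t dapl = H5Pcreate(H5P_DATASET_ACCESS);
    H5Pset_chunk_cache_group(dapl, cache_grp);   /* overrides rdcc_* settings */
    hid_t dset1 = H5Dopen2(file_id, "/a", dapl); /* both datasets now share   */
    hid_t dset2 = H5Dopen2(file_id, "/b", dapl); /* a single chunk cache      */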
There is one thing to remember: the chunk cache matters when a chunk is
compressed and is accessed multiple times. If this is not the case (for
example, the application always reads subsets that contain whole
chunks), one can disable the chunk cache completely to reduce the
application's memory footprint.
This is clear. My typical workflow involves multiple accesses to the
same chunks.
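For the record, if I read the documentation correctly, disabling the
cache per dataset is already possible with the existing
H5Pset_chunk_cache call, by passing zero for rdcc_nslots and
rdcc_nbytes on the dataset access property list:

    /* Assumes file_id is an already-open file; "/mydata" is illustrative. */
    hid_t dapl = H5Pcreate(H5P_DATASET_ACCESS);
    /* nslots = 0, nbytes = 0: no chunks are cached, so every access
       reads the chunk from disk (and decompresses it again). */
    H5Pset_chunk_cache(dapl, 0, 0, H5D_CHUNK_CACHE_W0_DEFAULT);
    hid_t dset = H5Dopen2(file_id, "/mydata", dapl);
    /* ... work with dset ... */
    H5Dclose(dset);
    H5Pclose(dapl);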
Thank you for your work on the HDF5 library,
and best wishes,
Andrey Paramonov