Hello, we are using block-level deduplication infrastructure that lets us synchronize files across our enterprise. To make this as effective as possible, we need the beginning of each file to stay as stable as possible.
We have HDF5 files ranging from 0.5 to 20 GB, and after the initial creation we generally alter only about 5% of the data, confined to specific datasets. We would like to structure these files so that we get the most benefit from the deduplication described above.

Questions:

1) It appears that H5Pset_fapl_split() is the direction to look for separating raw data from metadata. Is this fully supported? Are there any performance issues with these drivers compared to the default single-file driver?

2) Is there a way to specify which file a particular dataset is stored in? In my ideal scenario I would have three files: one for the metadata, which is potentially the most volatile as blocks change; one for the datasets whose data I am altering, which would be somewhat volatile; and one for my most static data.

Any other advice or practical experience in this regard would be appreciated.

Best regards,
--Jim
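P.S. In case it helps clarify question 1, here is a minimal sketch of the split-driver setup I have in mind. The file name, extensions, and the example dataset are just placeholders, and error checking is omitted:

    #include "hdf5.h"

    int main(void)
    {
        /* File access property list using the split driver:
         * metadata goes to "example.meta", raw data to "example.raw". */
        hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
        H5Pset_fapl_split(fapl, ".meta", H5P_DEFAULT, ".raw", H5P_DEFAULT);

        /* "example" is the base name; the driver appends the extensions. */
        hid_t file = H5Fcreate("example", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

        /* A small dataset whose raw data would land in example.raw. */
        hsize_t dims[1] = {1024};
        hid_t space = H5Screate_simple(1, dims, NULL);
        hid_t dset  = H5Dcreate2(file, "/static_data", H5T_NATIVE_DOUBLE, space,
                                 H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

        H5Dclose(dset);
        H5Sclose(space);
        H5Fclose(file);
        H5Pclose(fapl);
        return 0;
    }

My understanding is that this only gives a two-way metadata/raw split, which is why I am asking in question 2 whether a per-dataset placement (the three-file scenario) is possible at all.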
