As much as I love HDF5 (and PyTables), I find that it becomes increasingly risky to use for storing large amounts of data when the code writing to it is still potentially buggy.

I have already learned the hard way never to store original experimental data in any database that might be opened with write access; and now I am finding that storing several days' worth of simulation data in HDF5 isn't quite feasible either. Perhaps it'd be fine once my code is all done and bug-free; for now, it crashes frequently. That's part of development, but I'd like to be able to do development without losing days' worth of data at a time, AND keep using HDF5.

My question, then, is: what are the best practices for dealing with these kinds of situations? One thing I am doing at the moment is splitting my data over several different .h5 files, so that writing to one table cannot take my whole dataset down with it (see the sketch below). It is unfortunate, though, that standard OS file systems are more robust than HDF5; I'd rather see it the other way around.
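For what it's worth, here is a minimal sketch of what I mean by splitting, using PyTables; the run generator and file names are just placeholders for my real code:

    import numpy as np
    import tables

    def simulation_runs(n_runs=3):
        """Stand-in for the actual simulation; yields one result array per run."""
        for _ in range(n_runs):
            yield np.random.rand(1000, 3)

    # One file per run: a crash while writing one file cannot
    # take the other runs down with it.
    for run_id, results in enumerate(simulation_runs()):
        with tables.open_file("run_%03d.h5" % run_id, mode="w") as h5:
            h5.create_array(h5.root, "results", results)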

I understand that there isn't much one can do about a program crashing in the middle of a B-tree update; that is never going to be pretty. But I could envision a rather simple solution: keep one or more fully redundant copies of the metadata structure, and only ever write one of them out at a time. If one becomes corrupted, you would at least still have all your data up to your last flush available. I could not care less about the extra disk space overhead, but in case anyone does, it should be easy to make the number of metadata copies kept configurable.
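I realize that would have to happen inside the library; something along these lines can at least be approximated at the application level by flushing and snapshotting the file periodically, so that a later crash can only corrupt the working copy. A rough sketch, with hypothetical file and node names:

    import shutil
    import numpy as np
    import tables

    with tables.open_file("simulation.h5", mode="w") as h5:
        samples = h5.create_earray(h5.root, "samples",
                                   atom=tables.Float64Atom(), shape=(0, 3))
        for step in range(100):
            samples.append(np.random.rand(10, 3))   # stand-in for real results
            if step % 25 == 0:
                h5.flush()                           # push data and metadata to disk
                # keep a known-good copy; a crash now only loses the working file
                shutil.copyfile("simulation.h5", "simulation.h5.bak")

This is obviously cruder than redundant metadata inside the file itself, but it limits the damage to whatever was written since the last snapshot.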

Is there already such functionality that I have not noticed, is it (or should it be) planned functionality, or am I missing other techniques for dealing with these kinds of situations?

Thank you for your input,
Eelco Hoogendoorn
