On 12 Sep, 14:39, "Aaron \"Castironpi\" Brady" <[EMAIL PROTECTED]> wrote:
> > A consideration of other storage formats such as HDF5 might
> > be appropriate:
> >
> > http://hdf.ncsa.uiuc.edu/HDF5/whatishdf5.html
> >
> > There are, of course, HDF5 tools available for Python.
> > PyTables came up within the past few weeks on the list.
>
> "When the file is created, the metadata in the object tree is updated
> in memory while the actual data is saved to disk. When you close the
> file the object tree is no longer available. However, when you reopen
> this file the object tree will be reconstructed in memory from the
> metadata on disk...."
> (From http://www.pytables.org/docs/manual/ch01.html#id2506782)
>
> This is different from what I had in mind, but how serious it is depends
> on how slow the 'reconstructed in memory' step is. The
> counterexample would be needing random access into multiple data
> files, which don't all fit in memory at once, but the maturity of the
> package might outweigh that. Reconstruction will form a bottleneck
> anyway.
Hmm, this was part of the documentation that needed to be updated. Now the object tree is reconstructed lazily (i.e. on demand), precisely to avoid the bottleneck that you mentioned. I have corrected the docs in:

http://www.pytables.org/trac/changeset/3714/trunk

Thanks for (indirectly ;-) bringing this to my attention,

Francesc
--
http://mail.python.org/mailman/listinfo/python-list
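For readers curious what "lazy reconstruction" means in practice, here is a minimal sketch of the on-demand idea in plain Python. This is *not* PyTables' actual implementation; the `LazyTree` class and its `_metadata`/`_cache` attributes are hypothetical names used only to illustrate the pattern of building child nodes on first access instead of all at open time:

```python
# Hypothetical sketch of lazy (on-demand) object-tree reconstruction.
# LazyTree and its attribute names are illustrative, not the PyTables API.

class LazyTree:
    """Builds child nodes only when they are first accessed."""

    def __init__(self, metadata):
        # metadata maps node names to whatever is needed to build them
        self._metadata = metadata
        self._cache = {}  # nodes reconstructed so far

    def __getattr__(self, name):
        # Only called when `name` is not a normal instance attribute:
        # reconstruct the node from metadata on first access, then cache it.
        try:
            meta = self._metadata[name]
        except KeyError:
            raise AttributeError(name)
        node = self._cache.get(name)
        if node is None:
            node = self._build(meta)
            self._cache[name] = node
        return node

    def _build(self, meta):
        # Stand-in for actually reading the node's data from disk.
        return dict(meta)


tree = LazyTree({"array1": {"shape": (10,)}, "group1": {"kind": "group"}})
print(len(tree._cache))  # 0 -- nothing is built at "open" time
node = tree.array1       # this node is reconstructed on demand
print(len(tree._cache))  # 1 -- only the accessed node was built
```

With this pattern, "reopening" a file costs almost nothing up front; the reconstruction work is paid per node, and only for the nodes a program actually touches.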