On 24-May-09, at 5:22 PM, Robert Kern wrote:

>> While I haven't tried Andrew Collette's h5py
>> (http://code.google.com/p/h5py), it looks like a very 'thin' wrapper
>> around the HDF5 C libraries. Maybe numpy's save(), savez(), load(),
>> memmap() could be enhanced so that saving/loading files with
>> HDF5-like file extensions used the HDF5 format, with code based on
>> h5py and pyhdf5io. This could, I imagine, be a relatively small/simple
>> addition to numpy, with the only external dependency being the HDF5
>> libraries themselves.
>
> *libhdf5* is too big, not PyTables.

Yup. According to sloccount, numpy is roughly 210,000 lines of code.
The hdf5 library is ~385,000 lines. Including even a small part of
libhdf5 would grow the code base significantly, and requiring it as a
dependency isn't a good idea, since libhdf5 can be tricky to build
correctly.

As Robert's design document for the NPY format says, one option would
be to implement a minimal subset of the HDF5 file format *from scratch*
(the part that would be required for saving NumPy arrays as top-level
leaf nodes, for example). This would also sidestep any tricky licensing
issues. (I don't know the details of the HDF5 license; I know it's
fairly permissive, but it still might not be suitable for including any
of its code in NumPy.)

David