David Warde-Farley wrote:
> On 23-May-09, at 4:25 PM, Albert Thuswaldner wrote:
>> Actually my vision with pyhdf5io is to have hdf5 to replace numpy's
>> own binary file format (.npy, .npz). Pyhdf5io (or an incarnation of it)
>> should be the standard (binary) way to store data in scipy/numpy. A
>> bold statement, I know, but I think that it would be an improvement,
>> especially for those users who are replacing Matlab with scipy/numpy.
>>
> In that it introduces a dependency on pytables (and the hdf5 C
> library) I doubt it would be something the numpy core developers would
> be eager to adopt.
>
> The npy and npz formats (as best I can gather) exist so that there is
> _some_ way of persisting data to disk that ships with numpy. It's not
> meant necessarily as the best way, or as an interchange format, just
> as something that works "out of the box", the code for which is
> completely contained within numpy.
>
> It might be worth mentioning the limitations of numpy's built-in
> save(), savez() and load() in the docstrings and recommending more
> portable alternatives, though.
>
> David
>
I tend to agree with David that PyTables is too big a dependency for
inclusion in core numpy. It does a lot more than simply loading and
saving arrays.

While I haven't tried Andrew Collette's h5py
(http://code.google.com/p/h5py), it looks like a very 'thin' wrapper
around the HDF5 C libraries. Maybe numpy's save(), savez(), load() and
memmap() could be enhanced so that saving/loading files with HDF5-like
file extensions used the HDF5 format, with code based on h5py and
pyhdf5io.

This could, I imagine, be a relatively small/simple addition to numpy,
with the only external dependency being the HDF5 libraries themselves.

Stephen