Hi again,

* onefire <onefire.mys...@gmail.com> [2014-04-18]:
> I think your workaround might help, but a better solution would be to not
> use Python's zipfile module at all. This would make it possible to, say,
> let the user choose the checksum algorithm or to turn that off.
> Or maybe the compression stuff makes this route too complicated to be worth
> the trouble? (after all, the zip format is not that hard to understand)

Just to give you an idea of what my aforementioned Bloscpack library can do
in the case of linspace:

In [1]: import numpy as np

In [2]: import bloscpack as bp

In [3]: import bloscpack.sysutil as bps

In [4]: x = np.linspace(1, 10, 50000000)

In [5]: %timeit np.save("x.npy", x) ; bps.sync()
1 loops, best of 3: 2.12 s per loop

In [6]: %timeit bp.pack_ndarray_file(x, 'x.blp') ; bps.sync()
1 loops, best of 3: 627 ms per loop

In [7]: %timeit -n 3 -r 3 np.save("x.npy", x) ; bps.sync()
3 loops, best of 3: 1.92 s per loop

In [8]: %timeit -n 3 -r 3 bp.pack_ndarray_file(x, 'x.blp') ; bps.sync()
3 loops, best of 3: 564 ms per loop

In [9]: ls -lah x.npy x.blp
-rw-r--r-- 1 root root  49M Apr 18 12:53 x.blp
-rw-r--r-- 1 root root 382M Apr 18 12:52 x.npy

However, this is a bit of a special case, since Blosc does extremely well --
both speed- and size-wise -- on the linspace data (here roughly 3.4x faster
and 8x smaller than np.save), so your mileage may vary.

best,

V-
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
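
P.S. A rough way to see why linspace is such a friendly input, and why other
data may behave differently, using only the Python standard library. This is
a sketch under stated assumptions, not Bloscpack's or Blosc's API: zlib stands
in for the compressor, and the hand-written `shuffle` helper only approximates
the byte-shuffle filter that, as far as I understand, Blosc applies before
compressing. The helper names, the 200000-element size and the random
comparison data are all made up for illustration.

```python
import random
import struct
import zlib


def linspace(start, stop, num):
    """Return `num` evenly spaced floats from start to stop (stdlib stand-in)."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def to_bytes(values):
    """Pack floats as little-endian float64, 8 bytes per element."""
    return struct.pack("<%dd" % len(values), *values)


def shuffle(raw, itemsize=8):
    """Byte shuffle: all first bytes of each item, then all second bytes, etc."""
    return b"".join(raw[i::itemsize] for i in range(itemsize))


def unshuffle(shuffled, itemsize=8):
    """Inverse of shuffle(): interleave the byte streams back into items."""
    n = len(shuffled) // itemsize
    out = bytearray(len(shuffled))
    for i in range(itemsize):
        out[i::itemsize] = shuffled[i * n:(i + 1) * n]
    return bytes(out)


def ratio(raw, use_shuffle):
    """Compressed size divided by raw size (zlib, default level)."""
    payload = shuffle(raw) if use_shuffle else raw
    return len(zlib.compress(payload)) / len(raw)


N = 200_000  # assumption: far smaller than the 50000000 used in the timings
rng = random.Random(0)  # fixed seed so repeated runs compare the same data
datasets = {
    "linspace": to_bytes(linspace(1.0, 10.0, N)),
    "random": to_bytes([rng.random() for _ in range(N)]),
}

# Compression ratio for each dataset, without and with the byte shuffle.
results = {
    name: {"plain": ratio(raw, False), "shuffled": ratio(raw, True)}
    for name, raw in datasets.items()
}

for name, r in results.items():
    print("%-8s plain: %.2f  shuffled: %.2f" % (name, r["plain"], r["shuffled"]))
```

The absolute numbers will not match the Blosc figures above (different codec,
much smaller array); the point is only the relative difference between smooth
and random data, and how much the shuffle step contributes.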