Dear all,

when preparing the transition of our repositories from python 2 to python 3, I encountered a problem loading pytables (.h5) files generated using python 2. I suspect that it is caused by a problem with pickling numpy arrays under python 3:
The code appended at the end of this mail works fine on either python 2.7 or python 3.4. However, generating the data on python 2 and trying to load them on python 3 gives some strange string

    b'(lp1\ncnumpy.core.multiarray\n_reconstruct\np2\n(cnumpy\nndarray ...

instead of

    [array([ 0., 1., 2., 3., 4., 5.]), array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])]

The problem sounds very similar to the one reported here

    https://github.com/numpy/numpy/issues/4879

which was fixed with numpy 1.9. I tried different versions/combinations of numpy (including 1.9.2) and always end up with the above result.

Also I tried to reduce the problem down to the level of pure numpy and pickle (as in the above bug report):

    import numpy as np
    import pickle

    arr1 = np.linspace(0.0, 1.0, 2)
    arr2 = np.linspace(0.0, 2.0, 3)
    data = [arr1, arr2]
    p = pickle.dumps(data)
    print(pickle.loads(p))
    p  # in an interactive session this displays the pickled string

Using the resulting string for p as input string (with b added at the beginning) under python 3 gives

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 14: ordinal not in range(128)

Can someone reproduce the problem with pytables? Is there maybe a work-around? (And no: I can't re-generate the "old" data files - it's hundreds of .h5 files ... ;-).
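One possible work-around, sketched here under assumptions and not tested against the .h5 files in question: on python 3, `pickle.loads` accepts an `encoding` argument, and the pickle documentation says `encoding="latin1"` is required for unpickling numpy arrays pickled by python 2. The helper name `load_py2_pickle` is made up for illustration. The sketch assumes the attribute comes back from pytables as raw pickle bytes (as in the strange string above). It runs on python 3 only, and numpy must be importable when the pickle actually contains arrays. The tiny hand-written pickle below only serves to reproduce the same kind of UnicodeDecodeError without needing a python 2 installation.

```python
import pickle


def load_py2_pickle(raw, encoding="latin1"):
    """Unpickle bytes written by python 2 (hypothetical helper).

    Old 8-bit python 2 strings are decoded with `encoding` instead of the
    default ASCII. Anything that is not bytes is returned unchanged, on the
    assumption that it was already unpickled successfully.
    """
    if not isinstance(raw, bytes):
        return raw
    return pickle.loads(raw, encoding=encoding)


# Hand-written protocol-0 pickle of the python 2 str '\xf0'
# (the byte that shows up in the error message above).
py2_pickle = b"S'\\xf0'\np0\n."

try:
    pickle.loads(py2_pickle)  # default encoding is ASCII
except UnicodeDecodeError as exc:
    print("default decoding fails:", exc)

value = load_py2_pickle(py2_pickle)
print("with latin1:", repr(value))
```

With pytables the same helper would be applied to whatever `fpt.get_node_attr("/", "list_of_arrays")` returns, i.e. `load_py2_pickle(result)`. Whether this fully recovers the list of arrays from real files is an assumption that still needs checking.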
Many thanks, best, Arnd

##############################################################################
"""Illustrate problem with pytables data - python 2 to python 3."""

from __future__ import print_function
import sys
import numpy as np
import tables as tb


def main():
    """Run the example."""
    print("np.__version__=", np.__version__)
    check_on_same_version = False

    arr1 = np.linspace(0.0, 5.0, 6)
    arr2 = np.linspace(0.0, 10.0, 11)
    data = [arr1, arr2]

    # Only generate on python 2.X or check on the same python version:
    if sys.version_info[0] < 3 or check_on_same_version:
        fpt = tb.open_file("tstdat.h5", mode="w")
        fpt.set_node_attr(fpt.root, "list_of_arrays", data)
        fpt.close()

    # Load the saved file:
    fpt = tb.open_file("tstdat.h5", mode="r")
    result = fpt.get_node_attr("/", "list_of_arrays")
    fpt.close()
    print("Loaded:", result)


main()

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion