On Thu, 2009-12-03 at 13:04 +0100, René Dudfield wrote:
[clip]
> > In other news, we cannot support Py2 pickles in Py3 -- this is
> > because Py2 str is unpickled as Py3 str, resulting in encoding
> > failures even before the data is passed on to Numpy.
>
> Is this just for the type codes? Or is there other string data that
> needs to be pickle loaded? If it is just for the type codes, they are
> all within the ansi character set and unpickle fine without errors.
> I'm guessing numpy uses strings to pickle arrays?

The array data is put in a string in __reduce__. The dtype is, IIRC,
mostly stored using integers, though endianness is stored with a
character.

Actually, now that I look more closely, Py3 pickle.load takes an
'encoding' argument, which will perhaps help here. We should probably
just instruct users to pass 'latin1' there in Py3 if they want
backwards compatibility. The Numpy __reduce__ and __setstate__ C code
must then just be checked for compatibility.

[clip]
> Using the python array module to store data might be the way to
> go (rather than strings), since that is available in both py2 and py3.

The array module has the same problem as Numpy, so using it will not
help:

    $ python
    Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41)
    >>> import array, pickle
    >>> c = array.array('b', '123öä')
    >>> c
    array('b', [49, 50, 51, -61, -74, -61, -92])
    >>> f = open('foo.pck', 'w'); pickle.dump(c, f); f.close()

    $ python3
    Python 3.0.1+ (r301:69556, Apr 15 2009, 15:59:22)
    >>> import pickle
    >>> f = open('foo.pck', 'rb')
    >>> pickle.load(f)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3.0/pickle.py", line 1335, in load
        return Unpickler(file, encoding=encoding, errors=errors).load()
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)

The 'encoding' argument does not actually help the array module, but
that may be just because of some incompatible __setstate__ stuff in
'array'.

[clip]
> A set of pickles saved from python2 would be useful for testing.
> Forwards compatibility is also a useful thing to test. That is py3.1
> pickles saved to be loaded with python2 numpy.

In Py3 it would be very convenient for __getstate__ to return the array
data as bytes (e.g. space savings!), which will be forward incompatible
unless the Py2 side has a custom unpickler.

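A minimal Py3 sketch of the 'latin1' suggestion for a plain Py2 str. The pickle bytes are written by hand to mimic what Py2 would emit at protocol 2 for the string '123öä'. That is an assumption, not output captured from a real Py2 session. The helper name load_py2_str_as_bytes is made up for illustration. This says nothing yet about whether the Numpy __setstate__ code accepts the result.

```python
import pickle

# UTF-8 bytes of '123öä', i.e. what a Py2 str literal would hold.
raw = b'123\xc3\xb6\xc3\xa4'

# Hand-written protocol-2 pickle of that Py2 str (assumed layout):
# PROTO 2, SHORT_BINSTRING + 1-byte length + payload, BINPUT 0, STOP.
py2_pickle = b'\x80\x02U' + bytes([len(raw)]) + raw + b'q\x00.'


def load_py2_str_as_bytes(data):
    """Unpickle a Py2 str via latin1 and map it back to the raw bytes.

    latin1 maps every byte 0..255 to the code point with the same
    number, so encoding the resulting text again is lossless.
    """
    text = pickle.loads(data, encoding='latin1')
    return text.encode('latin1')


# With the default encoding ('ASCII') the load fails on the first
# non-ASCII byte, as in the traceback above.
try:
    pickle.loads(py2_pickle)
    default_failed = False
except UnicodeDecodeError:
    default_failed = True

recovered = load_py2_str_as_bytes(py2_pickle)
```

Passing encoding='bytes' to pickle.loads is another option in current Py3 releases. It returns the Py2 str directly as a bytes object, at the cost of also turning every other Py2 str in the pickle (dict keys, for instance) into bytes.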
-- 
Pauli Virtanen

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion