Hi all,

As this also affects .npy files, which use pickle internally, why can't
this be done by NumPy itself? In my opinion this breaks backwards
compatibility in a very bad way. The company I worked for uses NumPy and
related packages a lot and also has a lot of data in .npy and pickle
files. They currently work with Python 2.7, but I also tried to make my
programs compatible with Python 3. That was not possible once it came to
dumping and loading .npy files. I think this will be a major reason why
people won't take the step forward to Python 3, and why NumPy will not be
considered compatible with Python 3.

Just my 5 cents,
Sebastian

On 03/06/2015 04:37 PM, Ryan Nelson wrote:
> Arnd,
>
> I can see where this is an issue. If you are trying to update your code
> for Py3, I still think that it would really help to add a version
> attribute of some sort to your new HDF files. You can then write a
> little check in your access code that looks for this variable. If it is
> not present, you know that it is an old file, and you can use the trick
> that I gave you. Otherwise, it will process the file as normal. It could
> even throw a little error saying that the file is outdated. You could
> write a small conversion script that could run through old files and
> reprocess them into the new format. Fortunately, Python is pretty good
> at automating tasks, even for hundreds of files :)
>
> It might be informative to ask at the PyTables list to see what they've
> done. The Pandas folks also do a lot with HDF files, and they have
> certainly worked their way through the Py2-3 transition. Also, because
> this is an issue with Python pickle, a quick note on SO might get some
> hits. I tried your script using a list of lists, rather than a list of
> arrays, and the same problem still persists. So, as Pauli notes, this is
> going to be a problem regardless of the type of attributes you set. I
> think you're just going to have to hard-code some kind of check in your
> code to switch behavior.
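
The version check Ryan describes could be sketched roughly as below. This is only an illustration: a plain dict stands in for the HDF file's attributes, and the names `format_version`, `FORMAT_VERSION`, `stamp` and `pick_loader` are hypothetical, not part of PyTables or of Arnd's code.

```python
FORMAT_VERSION = 1  # hypothetical version number written by the new code


def stamp(attrs):
    """Mark the attributes of a freshly written file with the current version."""
    attrs["format_version"] = FORMAT_VERSION
    return attrs


def pick_loader(attrs):
    """Return 'legacy' for unversioned (old) files and 'current' otherwise."""
    version = attrs.get("format_version")
    if version is None:
        # No marker at all: the file predates versioning, use the workaround.
        return "legacy"
    if version > FORMAT_VERSION:
        # Written by newer code than this reader knows about.
        raise ValueError("unsupported format version: %r" % (version,))
    return "current"
```

A conversion script would then loop over the old files, load each one through the legacy path, and write it back with the version marker set.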
>
> I recently switched to using Py3 exclusively, and although it was
> painful at first, I'm quite happy with Py3 overall. I also use the
> Anaconda Python distribution, which makes it very easy to have Py2 and
> Py3 environments if you need to switch back and forth.
>
> Sorry if that doesn't help much. Just some thoughts from my recent
> conversion experiences.
>
> Ryan
>
> On Fri, Mar 6, 2015 at 9:48 AM, Arnd Baecker <arnd.baec...@web.de> wrote:
>
> > On Fri, 6 Mar 2015, Pauli Virtanen wrote:
> >
> > > Arnd Baecker <arnd.baecker <at> web.de> writes:
> > > [clip]
> > > > Still I would have thought that this should be working out-of-the box,
> > > > i.e. without the pickle.loads trick?
> > >
> > > Pickle files should be considered incompatible between Python 2 and
> > > Python 3.
> > >
> > > Python 3 interprets all bytes objects saved by Python 2 as str and
> > > attempts to decode them under some unicode locale. The default locale
> > > is ASCII, so it will simply just fail in most cases if the files
> > > contain any binary data.
> > >
> > > Failing by default is also the right thing to do, since the saved
> > > bytes objects might actually represent strings in some locale, and
> > > ASCII is the safest guess.
> > >
> > > This behavior is that of Python's pickle module, and does not depend
> > > on Numpy.
> >
> > Thanks a lot for the explanation!
> >
> > So what is then the recommended way to save data under Python 2 so that
> > it can still be loaded under Python 3?
> >
> > For example, using np.save with a list of arrays works fine
> > either on Python 2 or on Python 3.
> > However, it does not work if one tries to open under Python 3
> > a file generated before on Python 2.
> > Again, this is because pickle is involved internally:
> >
> >   "python3.4/site-packages/numpy/lib/npyio.py", line 393, in load
> >     return format.read_array(fid)
> >   File "python34/lib/python3.4/site-packages/numpy/lib/format.py", line 602, in read_array
> >     array = pickle.load(fp)
> >   UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 ...
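
To make Pauli's explanation concrete, here is a minimal sketch using only the standard library. The byte string is hand-written to mimic a protocol-2 pickle of the Python 2 str '\xf0abc'; it is an assumed stand-in, not data from Arnd's files, and the helper names are made up for this example. I assume the "pickle.loads trick" mentioned above is something along these lines, but I have not checked.

```python
import pickle

# Hand-written protocol-2 pickle of the 4-byte Python 2 str '\xf0abc'
# (an assumed stand-in for data written under Python 2).
PY2_PICKLE = b"\x80\x02U\x04\xf0abcq\x00."


def load_py2_pickle(data, encoding="latin1"):
    """Unpickle Python 2 data under Python 3 with an explicit str encoding."""
    return pickle.loads(data, encoding=encoding)


def default_load_fails(data):
    """Return True if a plain pickle.loads() raises UnicodeDecodeError."""
    try:
        pickle.loads(data)  # default encoding is ASCII
    except UnicodeDecodeError:
        return True
    return False
```

With encoding="latin1" every byte is mapped to the code point of the same value, so the data comes back as a str without loss; encoding="bytes" returns the raw bytes instead. Which one is appropriate depends on what the bytes were meant to represent.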
> >
> > Just to be clear: I don't want to beat a dead horse here. For my usage
> > via PyTables I was able to solve the loading of old files following
> > Ryan's solutions. Personally I don't use .npy files.
> > Maybe saving a list containing arrays is an unusual example ...
> >
> > Still, I am a little bit worried about backwards compatibility:
> > being able to load old data files is important, because it makes it
> > possible to check whether current code still reproduces previously
> > obtained (maybe also published) results.
> >
> > Best, Arnd
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion@scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
> --
> python programming - mail server - photo - video - https://sebix.at
> To verify my cryptographic signature or send me encrypted mails, get my
> key at https://sebix.at/DC9B463B.asc and on public keyservers.