[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-24 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Patch updated (comment for load_binstring added). -- Added file: http://bugs.python.org/file28097/pickle_nonportable_size_2.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12848

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-24 Thread Roundup Robot
Roundup Robot added the comment: New changeset 55fe4b57dd9c by Antoine Pitrou in branch '3.2': Issue #12848: The pure Python pickle implementation now treats object lengths as unsigned 32-bit integers, like the C implementation does. http://hg.python.org/cpython/rev/55fe4b57dd9c New changeset

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-24 Thread Antoine Pitrou
Antoine Pitrou added the comment: I've committed the latest patch (pickle_nonportable_size_2.patch). Thank you for working on this! -- resolution: -> fixed stage: patch review -> committed/rejected status: open -> closed

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-17 Thread Antoine Pitrou
Antoine Pitrou added the comment: Here is a patch for 3.x. It unifies the behavior of the Python and C implementations, and the behavior on 32- and 64-bit platforms. For backward compatibility, Pickler can pickle up to 2G of data, but Unpickler can unpickle up to 4G on 64-bit. I agree the right

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-17 Thread Antoine Pitrou
Antoine Pitrou added the comment: I'd like to add that anyone wanting to serialize large data will certainly be using _pickle (or its ancestor cPickle), since using pickle.py is probably excruciatingly slow. Meaning we should favour preserving _pickle/cPickle's behaviour over preserving

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-17 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: The issue is not only the difference between the Python and C implementations, but also the difference between 32-bit and 64-bit. pickle.py on 32-bit accepts data up to 2G. pickle.py on 64-bit accepts data up to 2G. _pickle.c on 32-bit accepts data up to 2G. _pickle.c on 64-bit

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-17 Thread Martin v . Löwis
Martin v. Löwis added the comment: IMO, the right solution is to finish PEP 3154, and support large strings in the format. For the time being, I'd claim that signed lengths in the existing implementations are just a bug, and that unsigned lengths are the intended semantics of these opcodes. I

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-17 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Here is a patch for 3.x which extends supported size to 4G on 64-bit. -- Added file: http://bugs.python.org/file28010/pickle_nonportable_size.patch

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-17 Thread Antoine Pitrou
Antoine Pitrou added the comment: OTOH, I also think that it won't matter much in practice: if you try to unpickle a string with more than 2GiB on a 32-bit system, chances are really high that you run out of memory. Agreed. I think this issue is mostly about 64-bit systems, even though we

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-16 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Here is a patch for 3.x. It unifies the behavior of the Python and C implementations, and the behavior on 32- and 64-bit platforms. For backward compatibility, Pickler can pickle up to 2G of data, but Unpickler can unpickle up to 4G on 64-bit. -- keywords:

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-15 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: The C implementation writes and reads BINBYTES and BINUNICODE up to 4G (on a 64-bit platform). The Python implementation writes and reads BINBYTES and BINUNICODE up to 2G. What would a compatible fix be? Allow the Python implementation to write and read up to
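To make the 2G versus 4G distinction concrete, here is a minimal sketch of how a BINBYTES record is laid out: an opcode byte, a 4-byte length field, then the payload. It assumes the length field is little-endian (as pickletools documents for 4-byte length arguments) and uses a tiny payload; `hand_built`, `SIGNED_MAX` and `UNSIGNED_MAX` are illustrative names, not part of any patch in this thread.

```python
import pickle
import struct

payload = b'abc'

# BINBYTES opcode, 4-byte little-endian length, payload bytes, STOP opcode.
hand_built = pickle.BINBYTES + struct.pack('<I', len(payload)) + payload + pickle.STOP

# The unpickler reads the length field to know how many bytes follow.
restored = pickle.loads(hand_built)

# Largest length expressible when the field is read as signed vs unsigned.
SIGNED_MAX = 2**31 - 1     # the "2G" limit
UNSIGNED_MAX = 2**32 - 1   # the "4G" limit

print(restored, SIGNED_MAX, UNSIGNED_MAX)
```

The same four bytes can describe twice as much data when read as unsigned, which is why the choice of interpretation decides whether payloads between 2G and 4G can be loaded.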

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-06 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: What if we just add 0x? -- nosy: +serhiy.storchaka versions: +Python 3.4

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-06 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Ah, for unpacking 32-bit unsigned big-endian bytes you can use len = int.from_bytes(self.read(4), 'big').
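A small sketch of the int.from_bytes approach, showing how much the byte-order argument matters. The comment above uses 'big'; pickletools documents the 4-byte length arguments of these opcodes as little-endian, so 'little' is assumed here to be the order a pickle reader would need. The sample bytes are made up for illustration.

```python
# Four sample length bytes; the high bit is set in the last byte.
raw = b'\x00\x00\x00\x80'

# int.from_bytes is unsigned by default (signed=False).
as_big = int.from_bytes(raw, 'big')
as_little = int.from_bytes(raw, 'little')

# Reading the same bytes as signed little-endian gives a negative length.
as_little_signed = int.from_bytes(raw, 'little', signed=True)

print(as_big, as_little, as_little_signed)
```

Read as little-endian, these bytes encode 2**31, exactly the first length that a signed reader turns negative.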

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-06 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Or you can use len = struct.unpack('I', self.read(4)).
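Two details are worth keeping in mind with the struct approach: struct.unpack always returns a tuple, and a bare 'I' format uses the platform's native byte order. The sketch below assumes an explicit little-endian '<I' format and unpacks the single element; the value 3_000_000_000 is just an example above 2**31.

```python
import struct

# Example length above 2**31, packed as 4 little-endian bytes.
raw = struct.pack('<I', 3_000_000_000)

# unpack returns a 1-tuple, not an int.
as_tuple = struct.unpack('<I', raw)

# Tuple unpacking (note the trailing comma) yields the integer itself.
length, = struct.unpack('<I', raw)

print(as_tuple, length)
```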

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-06 Thread Alexandre Vassalotti
Alexandre Vassalotti added the comment: pickle.py is the buggy one here. Its use of the marshal module is really a hack. Plus, it is slower than both struct and int.from_bytes. 14:40:57 [~/cpython]$ ./python -m timeit "int.from_bytes(b'\xff\xff\xff\xff', 'big')" 100 loops, best of 3: 0.209
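For anyone wanting to repeat this kind of comparison, here is a rough sketch using the timeit module from Python rather than the command line. The three statements and the loop count are assumptions chosen for illustration; the numbers depend on the interpreter build and machine, so none are shown.

```python
import timeit

# Shared setup: the modules under test and four sample bytes.
setup = r"import struct, marshal; raw = b'\xff\xff\xff\xff'"

# Three ways of decoding a 4-byte length field.
candidates = {
    'int.from_bytes': "int.from_bytes(raw, 'little')",
    'struct.unpack': "struct.unpack('<I', raw)[0]",
    'marshal.loads': "marshal.loads(b'i' + raw)",
}

# Best of three runs, in seconds per 100,000 calls.
timings = {
    name: min(timeit.repeat(stmt, setup=setup, number=100_000, repeat=3))
    for name, stmt in candidates.items()
}

for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f'{name}: {seconds:.4f}s')
```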

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2011-08-27 Thread Antoine Pitrou
New submission from Antoine Pitrou pit...@free.fr: In several opcodes (BINBYTES, BINUNICODE... what else?), _pickle.c happily accepts 32-bit lengths of more than 2**31, while pickle.py uses marshal's i typecode which means signed... and therefore fails reading the data. Apparently, pickle.py
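The mismatch described in the report can be reproduced with a few lines. This is only a sketch: it decodes four sample bytes encoding 2**31 (little-endian is assumed) once with marshal's 'i' typecode, as the report says pickle.py did, and once with struct as signed and unsigned.

```python
import marshal
import struct

# Four length bytes encoding 2**31 in little-endian order.
raw = struct.pack('<I', 2**31)

# marshal's 'i' typecode decodes a signed 32-bit integer.
signed_via_marshal = marshal.loads(b'i' + raw)

# The same bytes read explicitly as signed and as unsigned.
signed = struct.unpack('<i', raw)[0]
unsigned = struct.unpack('<I', raw)[0]

print(signed_via_marshal, signed, unsigned)
```

A negative length cannot be used to read the data that follows, which matches the failure described in the report.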