[issue13505] Bytes objects pickled in 3.x with protocol <=2 are unpickled incorrectly in 2.x

2011-12-13 Thread Alexandre Vassalotti
Alexandre Vassalotti added the comment: Fixed. Thanks for the patch! -- assignee: -> alexandre.vassalotti resolution: -> fixed stage: needs patch -> committed/rejected status: open -> closed

2011-12-13 Thread Roundup Robot
Roundup Robot added the comment: New changeset 14695b4825dc by Alexandre Vassalotti in branch '3.2': Issue #13505: Make pickling of bytes object compatible with Python 2. http://hg.python.org/cpython/rev/14695b4825dc -- nosy: +python-dev

2011-12-12 Thread sbt
sbt added the comment:
> Which is fine. 'bytes' and byte literals were not introduced until 2.6 [1,2]. So *any* solution we come up with is for >= 2.6.
In 2.6 and 2.7, bytes is just an alias for str. In all 2.x versions with codecs.encode, the result will be str. (Although I haven't ac

2011-12-12 Thread Meador Inge
Meador Inge added the comment:
On Sun, Dec 11, 2011 at 12:17 PM, sbt wrote:
>> I don't really know that much about pickle, but Antoine mentioned that 'bytearray' works fine going from 3.2 to 2.7. Given that, can't we just compose 'bytes' with 'bytearray'?
>
> Yes, although it wo

2011-12-12 Thread sbt
sbt added the comment:
I now realise latin_1_encode won't work because it returns a pair (bytes_obj, length). I have done a patch using _codecs.encode instead; the pickles turn out to be exactly the same size anyway.
>>> pickletools.dis(pickle.dumps(b"abc", 2))
    0: \x80 PROTO      2
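The difference between the two codec helpers named in this comment can be checked directly. The following is a minimal sketch, assuming a current Python 3 interpreter; the variable names are illustrative and the snippet is not taken from the patch itself.

```python
import codecs
import pickle

# codecs.latin_1_encode returns a (bytes, length) pair, so its result
# cannot stand in directly for the bytes object being rebuilt.
pair = codecs.latin_1_encode("abc")

# codecs.encode returns only the encoded object.
encoded = codecs.encode("abc", "latin1")

# Latin-1 maps code points 0..255 one-to-one onto byte values 0..255,
# so any bytes object survives a decode/encode round trip.
all_bytes = bytes(range(256))
round_tripped = codecs.encode(all_bytes.decode("latin1"), "latin1")

# A protocol 2 pickle of a bytes object loads back unchanged in Python 3.
restored = pickle.loads(pickle.dumps(b"abc", 2))
```

Because the Latin-1 round trip is lossless, a text string plus the codec name is enough to reconstruct the original bytes on the loading side.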

2011-12-12 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> Only worry is that codecs.latin_1_encode.__module__ is '_codecs', and _codecs is undocumented.
It seems we have to choose between two evils here. Given that the codecs.latin_1_encode produces more compact pickles, I'd say go for it. Note that for the empt

2011-12-11 Thread sbt
sbt added the comment:
> I don't really know that much about pickle, but Antoine mentioned that 'bytearray' works fine going from 3.2 to 2.7. Given that, can't we just compose 'bytes' with 'bytearray'?
Yes, although it would only work for 2.6 and 2.7. codecs.encode() seems to be ava

2011-12-10 Thread Meador Inge
Meador Inge added the comment:
I don't really know that much about pickle, but Antoine mentioned that 'bytearray' works fine going from 3.2 to 2.7. Given that, can't we just compose 'bytes' with 'bytearray'? Something like:
Python 3.3.0a0 (default:aab45b904141+, Dec 10 2011, 13:34:41) [GCC
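The session in this comment is cut off in the archive, so the following is only a sketch of what composing 'bytes' with 'bytearray' could look like, not a reconstruction of the original example. `BytesViaBytearray` is a hypothetical wrapper introduced purely for illustration; it is checked here on Python 3 only, and the behaviour on 2.6/2.7 is assumed from the discussion rather than verified.

```python
import pickle


class BytesViaBytearray:
    """Hypothetical wrapper, for illustration only (not part of pickle)."""

    def __init__(self, data):
        self.data = bytes(data)

    def __reduce__(self):
        # On load, rebuild the value as bytes(bytearray(...)). The inner
        # bytearray is pickled by its own reduce support.
        return (bytes, (bytearray(self.data),))


payload = pickle.dumps(BytesViaBytearray(b"xyz"), 2)

# The wrapper class itself is never stored in the pickle; loading
# yields a plain bytes object.
restored = pickle.loads(payload)
```

As noted in the replies, this approach depends on 'bytearray' existing on the loading side, which limits it to 2.6 and later.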

2011-12-09 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> > sbt, the bug is not that the encoding is inefficient. The problem is we cannot unpickle bytes streams from Python 3 using Python 2.
>
> Ah. Well you can do it using codecs.encode.
Great. A bit hackish but functional and not too inefficient (50% avera

2011-12-09 Thread sbt
sbt added the comment:
> sbt, the bug is not that the encoding is inefficient. The problem is we cannot unpickle bytes streams from Python 3 using Python 2.
Ah. Well you can do it using codecs.encode.
Python 3.3.0a0 (default, Dec 8 2011, 17:56:13) [MSC v.1500 32 bit (Intel)] on win32
Typ

2011-12-08 Thread Alexandre Vassalotti
Alexandre Vassalotti added the comment: sbt, the bug is not that the encoding is inefficient. The problem is we cannot unpickle bytes streams from Python 3 using Python 2.

2011-12-06 Thread sbt
sbt added the comment:
> One *dirty* trick I am thinking about would be to use something like array.tostring() to construct the byte string.
array('B', ...) objects are pickled using two bytes per character, so there would be no advantage:
>>> pickle.dumps(array.array('B', b"hello"), 2)
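The per-element cost mentioned here can be measured rather than read off the pickle by eye. This is a rough sketch, assuming a current CPython 3 interpreter; `pickled_size` is a helper name made up for the example.

```python
import array
import pickle


def pickled_size(n):
    """Length of the protocol 2 pickle of an n-element array('B')."""
    return len(pickle.dumps(array.array("B", b"x" * n), 2))


# Cost of each additional element, measured between two array sizes so
# that the fixed overhead of the pickle cancels out.
extra_per_element = (pickled_size(20) - pickled_size(10)) / 10

# The array still round-trips; only its size on the wire is the issue.
restored = pickle.loads(pickle.dumps(array.array("B", b"hello"), 2))
```

Comparing two sizes avoids having to account for the header, the type lookup, and the typecode string separately.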

2011-12-05 Thread Alexandre Vassalotti
Alexandre Vassalotti added the comment: I think we are kind of stuck here. I might need to rely on some clever hack to generate the desired str object in 2.7 without breaking the bytes support in 3.3 and without changing 2.7 itself. One *dirty* trick I am thinking about would be to use someth

2011-11-30 Thread Antoine Pitrou
Antoine Pitrou added the comment: After a bit of testing, my idea was flawed, as str() doesn't accept an encoding parameter in 2.x: `str(u'foo', 'latin1')` simply raises a TypeError.

2011-11-29 Thread Meador Inge
Changes by Meador Inge: -- nosy: +meador.inge stage: -> needs patch

2011-11-29 Thread Antoine Pitrou
New submission from Antoine Pitrou:
In Python 3.2:
>>> pickle.dumps(b'xyz', protocol=2)
b'\x80\x02c__builtin__\nbytes\nq\x00]q\x01(KxKyKze\x85q\x02Rq\x03.'
In Python 2.7:
>>> pickle.loads(b'\x80\x02c__builtin__\nbytes\nq\x00]q\x01(KxKyKze\x85q\x02Rq\x03.')
'[120, 121, 122]'
The problem is th
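The payload in this report can be inspected without executing it, which makes the cause of the 2.7 result easier to see. The sketch below assumes a Python 3 interpreter; the last line only emulates what 2.x computes, using the fact (stated elsewhere in this thread) that 'bytes' is an alias for 'str' there.

```python
import pickle
import pickletools

# Payload copied from the report above (produced by Python 3.2).
OLD_PAYLOAD = b'\x80\x02c__builtin__\nbytes\nq\x00]q\x01(KxKyKze\x85q\x02Rq\x03.'

# Walk the opcodes without executing them.
ops = [(op.name, arg) for op, arg, _pos in pickletools.genops(OLD_PAYLOAD)]

# The callable the pickle asks for, and the integers collected in its list argument.
global_name = next(arg for name, arg in ops if name == "GLOBAL")
int_args = [arg for name, arg in ops if name == "BININT1"]

# Python 3: bytes(list_of_ints) builds a bytes object from the values.
as_bytes = bytes(int_args)

# Python 2.x emulation: with bytes being str, the same call is str(list).
as_py2_str = str(int_args)
```

The pickle therefore encodes "call bytes on a list of integers", which means different things on the two sides.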