New submission from Antoine Pitrou <pit...@free.fr>:

In Python 3.2:

>>> pickle.dumps(b'xyz', protocol=2)
b'\x80\x02c__builtin__\nbytes\nq\x00]q\x01(KxKyKze\x85q\x02Rq\x03.'

In Python 2.7:

>>> pickle.loads(b'\x80\x02c__builtin__\nbytes\nq\x00]q\x01(KxKyKze\x85q\x02Rq\x03.')
'[120, 121, 122]'

The problem is that the bytes() constructor argument is a list of ints, which gives a different result when reconstructed under 2.x where bytes is an alias of str:

>>> pickletools.dis(pickle.dumps(b'xyz', protocol=2))
    0: \x80 PROTO      2
    2: c    GLOBAL     '__builtin__ bytes'
   21: q    BINPUT     0
   23: ]    EMPTY_LIST
   24: q    BINPUT     1
   26: (    MARK
   27: K        BININT1    120
   29: K        BININT1    121
   31: K        BININT1    122
   33: e        APPENDS    (MARK at 26)
   34: \x85 TUPLE1
   35: q    BINPUT     2
   37: R    REDUCE
   38: q    BINPUT     3
   40: .    STOP
highest protocol among opcodes = 2

Bytearray objects use a different trick: they pass a (unicode string, encoding) pair which has the same constructor semantics under 2.x and 3.x. Additionally, such encoding is statistically more efficient: a list of 1-byte ints will take 2 bytes per encoded char, while a latin1-to-utf8 transcoded string (BINUNICODE uses utf-8) will take on average 1.5 bytes per encoded char (assuming a 50% probability of higher-than-127 bytes).

>>> pickletools.dis(pickle.dumps(bytearray(b'xyz'), protocol=2))
    0: \x80 PROTO      2
    2: c    GLOBAL     '__builtin__ bytearray'
   25: q    BINPUT     0
   27: X    BINUNICODE 'xyz'
   35: q    BINPUT     1
   37: X    BINUNICODE 'latin-1'
   49: q    BINPUT     2
   51: \x86 TUPLE2
   52: q    BINPUT     3
   54: R    REDUCE
   55: q    BINPUT     4
   57: .    STOP
highest protocol among opcodes = 2

----------
components: Library (Lib)
messages: 148635
nosy: alexandre.vassalotti, irmen, pitrou
priority: high
severity: normal
status: open
title: Bytes objects pickled in 3.x with protocol <=2 are unpickled incorrectly in 2.x
type: behavior
versions: Python 3.2, Python 3.3

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue13505>
_______________________________________
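
For illustration only, here is a small Python 3 sketch of the two points made in the report above: the constructor mismatch between 2.x and 3.x, and the size estimate for the two encodings. It is not taken from the tracker. The helper names are hypothetical, str() is used as a stand-in for what the 2.x bytes() call does (since bytes is an alias of str there), and the size helpers count only the per-byte payload, not opcode headers, length prefixes or memo entries.

```python
# Illustrative sketch for Python 3; all helper names are hypothetical.

ints = [120, 121, 122]

# 3.x: bytes(list_of_ints) builds a bytes object from the integer values.
py3_result = bytes(ints)

# 2.x: bytes is an alias of str, so bytes(list) behaves like str(list).
# Assumption: str() on Python 3 stands in for the 2.x call here.
py2_like_result = str(ints)

# The (unicode string, encoding) pair used for bytearray means the same
# thing on both lines: encode the text with the named codec.
pair_result = bytearray('xyz', 'latin-1')


def list_of_ints_payload_size(data: bytes) -> int:
    """Payload for the list form: one BININT1 ('K' + 1 byte) per value."""
    return 2 * len(data)


def latin1_utf8_payload_size(data: bytes) -> int:
    """Payload for the string form: Latin-1 decoded, then UTF-8 encoded."""
    return len(data.decode('latin-1').encode('utf-8'))


# Every byte value once: half below 128, half at or above 128.
sample = bytes(range(256))

print(py3_result)
print(py2_like_result)
print(pair_result)
print(list_of_ints_payload_size(sample), latin1_utf8_payload_size(sample))
```

With the 256-byte sample, the list form needs 512 payload bytes and the transcoded string needs 384, i.e. 2 and 1.5 bytes per encoded char, which agrees with the estimate in the report under its 50% assumption. Data that is mostly ASCII would do better than 1.5, and data that is mostly above 127 would approach 2.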