Hello,

On Friday 12 August 2011 at 14:32 +0200, Xavier Morel wrote:
> On 2011-08-12, at 12:58 , Antoine Pitrou wrote:
> > Current protocol versions export object sizes for various built-in types
> > (str, bytes) as 32-bit ints. This forbids serialization of large data
> > [1]_. New opcodes are required to support very large bytes and str
> > objects.
> How about changing object sizes to be 64b always? Too much overhead for the
> common case (which might be smaller pickled objects)?

Yes, and also the old opcodes must still be supported, so there's no
maintenance gain in not exploiting them.

> Or a slightly more
> devious scheme (e.g. tag-bit, untagged is 31b size, tagged is 63), which
> would not require adding opcodes for that?

The opcode space is not full enough to justify this kind of
complication, IMO.

> > Also, dedicated set support
> > could help remove the current impossibility of pickling
> > self-referential sets [2]_.
>
> Is there really no possibility of fixing recursive pickling once
> and for all? Dedicated opcodes for resource consumption
> purposes (and to match those of other built-in types) are
> still a good idea, but being able to pickle arbitrary
> recursive structures would be even better, would it not?

That's true. Actually, it seems pickling recursive sets could have
worked from the start, if a different __reduce__ had been chosen and a
__setstate__ had been defined:

>>> import pickle
>>> class X: pass
...
>>> class myset(set):
...     def __reduce__(self):
...         return (self.__class__, (), list(self))
...     def __setstate__(self, state):
...         self.update(state)
...
>>> m = myset((1,2,3))
>>> x = X()
>>> x.m = m
>>> m.add(x)
>>> mm = pickle.loads(pickle.dumps(m))
>>> m
myset({1, 2, 3, <__main__.X object at 0x7fe3635c6990>})
>>> mm
myset({1, 2, 3, <__main__.X object at 0x7fe3635c6c30>})

# m has a reference loop
>>> [x for x in m if getattr(x, 'm', None) is m]
[<__main__.X object at 0x7fe3635c6990>]

# mm retains a similar reference loop
>>> [x for x in mm if getattr(x, 'm', None) is mm]
[<__main__.X object at 0x7fe3635c6c30>]

# the representation is roughly as efficient as the original one
>>> len(pickle.dumps(set([1,2,3])))
36
>>> len(pickle.dumps(myset([1,2,3])))
37

We can't change set.__reduce__ (or __reduce_ex__) without a protocol
bump, though, since past Pythons would fail loading the pickles.

Regards

Antoine.
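
For anyone who wants to try the __reduce__/__setstate__ idea outside an
interactive session, here is a minimal standalone sketch of the same
pattern. The names Node, ReducibleSet, roundtrip and members_pointing_back
are hypothetical and chosen only for this illustration; the default pickle
protocol of whatever interpreter runs it is assumed.

```python
import pickle


class Node:
    """Plain attribute holder; hypothetical stand-in for X in the session."""


class ReducibleSet(set):
    """Hypothetical name for the myset pattern from the session."""

    def __reduce__(self):
        # No constructor arguments: the unpickler creates (and memoizes) an
        # empty instance first, then passes the members to __setstate__.
        return (self.__class__, (), list(self))

    def __setstate__(self, state):
        self.update(state)


def roundtrip(obj):
    """Pickle and unpickle obj with the default protocol."""
    return pickle.loads(pickle.dumps(obj))


def members_pointing_back(container):
    """Return the members whose 'owner' attribute is the container itself."""
    return [item for item in container
            if getattr(item, "owner", None) is container]


original = ReducibleSet((1, 2, 3))
node = Node()
node.owner = original      # the member refers to the set ...
original.add(node)         # ... and the set contains the member

restored = roundtrip(original)
print(len(members_pointing_back(restored)) == 1)
```

The point of returning an empty argument tuple is that the container
exists, and can be referred to, before any of its members are
reconstructed; the members only arrive later as state.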
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev