On Sat, Jun 23, 2012 at 3:19 AM, M Stefan <mstefa...@gmail.com> wrote:
> * UNION_FROZENSET: like UPDATE_SET, but create a new frozenset
>     stack before: ... pyfrozenset mark stackslice
>     stack after : ... pyfrozenset.union(stackslice)

Since frozensets are immutable, could you explain how adding the
UNION_FROZENSET opcode helps in pickling self-referential frozensets? Or
are you only adding this one to follow the current style used for pickling
dicts and lists in protocols 1 and onward?

> While this design allows pickling of self-referential sets,
> self-referential frozensets are still problematic. For instance,
> trying to pickle `fs':
>     a=A(); fs=frozenset([a]); a.fs = fs
> (when unpickling, the object a has to be initialized before it is added
> to the frozenset)
>
> The only way I can think of to make this work is to postpone
> the initialization of all the objects inside the frozenset until after
> UNION_FROZENSET.
> I believe this is doable, but there might be memory penalties if the
> approach is to simply store all the initialization opcodes in memory
> until pickling the frozenset is finished.

I don't think that's the only way. You could also emit a POP opcode to
discard the frozenset from the stack and then emit a GET to fetch it back
from the memo. This is how we currently handle self-referential tuples.
Check out the save_tuple method in pickle.py to see how it is done.
Personally, I would prefer that approach because it is already well-tested
and proven to work.

That said, your approach sounds good too. The memory trade-off could lead
to smaller pickles and more efficient decoding (though these
self-referential objects are rare enough that I don't think any
improvements there would matter much).

> While self-referential frozensets are uncommon, a far more problematic
> situation is with the self-referential objects created with REDUCE. While
> pickle uses the idea of creating empty collections and then filling them,
> reduce typically creates already-filled objects. For instance:
>     cnt = collections.Counter(); cnt[a]=3; a.cnt=cnt; cnt.__reduce__()
>     (<class 'collections.Counter'>, ({<__main__.A object at 0x0286E8F8>: 3},))
> where the A object contains a reference to the counter. Unpickling an
> object pickled with this reduce function is not possible, because the
> reduce function, which "explains" how to create the object, is asking
> for the object to exist before being created.

Your example seems to work on Python 3. I am not sure I follow what you
are trying to say. Can you provide a working example?

$ python3
Python 3.1.2 (r312:79147, Dec 9 2011, 20:47:34)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pickle, collections
>>> c = collections.Counter()
>>> class A: pass
...
>>> a = A()
>>> c[a] = 3
>>> a.cnt = c
>>> b = pickle.loads(pickle.dumps(a))
>>> b in b.cnt
True

> Pickle could try to fix this by detecting when reduce returns a class type
> as the first tuple arg and move the dict ctor parameter to the state, but
> this may not always be intended. It's also a bit strange that __getstate__
> is never used anywhere in pickle directly.

I would advise against any such change. The reduce protocol is already
fairly complex. Further, I don't think changing it this way would give us
any extra flexibility.

The documentation has a good explanation of how __getstate__ works under
the hood:
http://docs.python.org/py3k/library/pickle.html#pickling-class-instances

And if you need more, PEP 307 (http://www.python.org/dev/peps/pep-0307/)
provides some of the design rationale for the API.
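
P.S. To make the POP/GET approach for self-referential tuples a bit more concrete, here is a minimal sketch. It assumes a plain list can stand in for the object that points back at the tuple, and it pins protocol 2 so the opcode names are predictable; the variable names are mine, not anything taken from pickle.py.

```python
import pickle
import pickletools

# A tuple that refers to itself indirectly: t -> inner list -> t.
inner = []
t = (inner,)
inner.append(t)

# Protocol 2 is pinned here only to keep the opcode stream predictable.
data = pickle.dumps(t, protocol=2)

# Collect the opcode names the pickler emitted.
opnames = [opcode.name for opcode, arg, pos in pickletools.genops(data)]

# While saving the elements, the pickler ends up building and memoizing
# the tuple itself. When it returns to the outer tuple it discards the
# element it pushed (POP) and fetches the memoized tuple (BINGET).
print("POP emitted:", "POP" in opnames)
print("BINGET emitted:", "BINGET" in opnames)

# The cycle survives the round trip.
restored = pickle.loads(data)
print("cycle preserved:", restored[0][0] is restored)
```

Running pickletools.dis(data) on the same bytes prints the full opcode listing if you want to see exactly where the POP lands.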
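
On __getstate__ not being called by pickle directly: as far as I understand it, pickle reaches it through the default object.__reduce_ex__, which puts the result in the state slot of the reduce tuple. A small sketch, where the Point class and its cache attribute are made up purely for illustration:

```python
class Point:
    """Hypothetical class used only to illustrate the reduce protocol."""

    def __init__(self, x, y):
        self.x = x
        self.y = y
        self.cache = {}  # transient data we do not want in the pickle

    def __getstate__(self):
        # Return a copy of the instance dict without the transient field.
        state = self.__dict__.copy()
        del state["cache"]
        return state


p = Point(1, 2)

# The default __reduce_ex__ for protocol 2 returns a tuple whose first
# three items are (callable, args, state); state comes from __getstate__.
func, args, state = p.__reduce_ex__(2)[:3]

print(args)
print(state)
```

So __getstate__ only customizes the third item of the reduce value; the first two items still say how to create the (empty) object.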
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev