Eugene Toder added the comment: To recap, the issue is that pickle doesn't handle recursion via reduce arguments (i.e. the arguments to the constructor function, as returned in the 2nd element of the tuple from __reduce__). This leads to two kinds of effects:
class C:
    def __init__(self, x=None):
        self.x = x if x is not None else self
    def __reduce__(self):
        return C, (self.x,)

A. Recursion error:

>>> pickle.dumps(C())
Traceback (most recent call last):
  File "<pyshell#5>", line 1, in <module>
    pickle.dumps(C())
RuntimeError: maximum recursion depth exceeded while calling a Python object

This cannot be helped with the current reduce protocol. The error message may be improved, but that's about it.

B. Duplication of the object when unpickling:

>>> c = C([])
>>> c.x.append(c)
>>> c.x[0] is c
True
>>> c2 = pickle.loads(pickle.dumps(c))
>>> c2.x[0] is c2
False

This happens because the list (or another recursion-friendly type) inside the problematic object handles the recursion, but we still end up with an outer object that is a duplicate of the inner one. This can be solved the same way as for tuple:

>>> t = ([], 1, 2)
>>> t[0].append(t)
>>> t2 = pickle.loads(pickle.dumps(t))
>>> t2[0][0] is t2
True
>>> pickletools.dis(pickle.dumps(t))
    0: \x80 PROTO      3
    2: ]    EMPTY_LIST
    3: q    BINPUT     0
    5: h    BINGET     0
    7: K    BININT1    1
    9: K    BININT1    2
   11: \x87 TUPLE3
   12: q    BINPUT     1
   14: a    APPEND
   15: K    BININT1    1
   17: K    BININT1    2
   19: 0    POP
   20: 0    POP
   21: 0    POP
   22: h    BINGET     1
   24: .    STOP

After pickling its elements, tuple checks whether it is already in the memo. If it is, this means it was pickled by one of its elements, so it POPs all the elements from the stack and fetches itself via GET. This is somewhat inefficient, but probably the best it can do.

I suggest we do 3 things:

1. Improve the documentation for the __reduce__ function. It should mention that all state that a) can potentially point back to the object and b) is not strictly necessary for the constructor function should be passed via the 3rd element of the __reduce__ tuple (aka state) instead of the 2nd element, and applied by __setstate__. This handles recursion in a robust and optimal way.

2. Check that all built-in/standard types follow this advice. I see that Stefan Mihaila already fixed sets.

3. To fix case B above, add the memo check from save_tuple to save_reduce.
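The tuple behavior described above can be checked programmatically rather than by eyeballing the disassembly. A small sketch using pickletools.genops (the variable names are mine, and the exact opcode counts assume the protocol-3 stream shown above):

```python
import pickle
import pickletools

# Build a tuple that contains itself through its list element.
t = ([], 1, 2)
t[0].append(t)

data = pickle.dumps(t, protocol=3)

# The round trip preserves the cycle: the rebuilt tuple is reachable
# from its own first element.
t2 = pickle.loads(data)
assert t2[0][0] is t2

# The opcode stream shows save_tuple's memo check in action: after the
# elements are pickled, the already-memoized tuple POPs them and then
# fetches itself from the memo.
names = [op.name for op, arg, pos in pickletools.genops(data)]
assert 'POP' in names
assert 'BINGET' in names
assert names[-1] == 'STOP'
```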
While at it, we can consider doing the check after pickling every element, instead of only after pickling all of them, to reduce the number of POPs and the waste they cause.

----------
nosy: +eltoder, pitrou

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue1062277>
_______________________________________
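Suggestion 1 can be illustrated with a fixed version of the example class. This is a sketch of the advice, not proposed documentation wording; the class name C2 and the dict-shaped state are my own choices. Moving the potentially self-referential attribute out of the constructor arguments and into the state element lets the memo resolve the cycle, which fixes both case A and case B:

```python
import pickle

class C2:
    # Like C above, but the potentially self-referential attribute
    # travels in the state (3rd element of the __reduce__ tuple)
    # rather than in the constructor arguments (2nd element).
    def __init__(self, x=None):
        self.x = x if x is not None else self

    def __reduce__(self):
        # Empty constructor args; self.x goes into the state dict.
        return C2, (), {'x': self.x}

    def __setstate__(self, state):
        self.x = state['x']

# Case A no longer overflows the stack: the object is memoized before
# its state is pickled, so the self-reference becomes a memo GET.
a = C2()  # a.x is a
a2 = pickle.loads(pickle.dumps(a))
assert a2.x is a2

# Case B no longer produces a duplicate of the object.
b = C2([])
b.x.append(b)
b2 = pickle.loads(pickle.dumps(b))
assert b2.x[0] is b2
```

The key point is the ordering in save_reduce: the reconstructed object is put into the memo before its state is pickled, so anything in the state that points back at the object is emitted as a GET instead of being pickled again.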