[issue14288] Make iterators pickleable
Changes by Kristján Valur Jónsson krist...@ccpgames.com: -- resolution: -> fixed status: open -> closed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue14288 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue14288] Make iterators pickleable
Kristján Valur Jónsson krist...@ccpgames.com added the comment: Good idea, Antoine. So, with your suggested fix to the unittests, I'll commit this and then look on while Rome burns.
[issue14288] Make iterators pickleable
Roundup Robot devn...@psf.upfronthosting.co.za added the comment: New changeset 4ff234337e24 by Kristján Valur Jónsson in branch 'default': Issue #14288: Serialization support for builtin iterators. http://hg.python.org/cpython/rev/4ff234337e24 New changeset 51c88d51aa4a by Kristján Valur Jónsson in branch 'default': Issue #14288: Modify Misc/NEWS http://hg.python.org/cpython/rev/51c88d51aa4a -- nosy: +python-dev
[issue14288] Make iterators pickleable
Kristján Valur Jónsson krist...@ccpgames.com added the comment: Okay, I'll go ahead, fix the 'iter()' trick api name and apply the patch. Then we'll see what happens :). Any suggestions as to what documentation changes are needed? I don't think the list of pickleable objects is made explicit anywhere; it is largely a trial and error thing. But obviously Misc/NEWS is the prime candidate. Thanks
[issue14288] Make iterators pickleable
Antoine Pitrou pit...@free.fr added the comment:
> Okay, I'll go ahead, fix the 'iter()' trick api name and apply the patch. Then we'll see what happens :).
Please wait for reviews.
[issue14288] Make iterators pickleable
Kristján Valur Jónsson krist...@ccpgames.com added the comment: Raymond had already reviewed it, and sbt. I wasn't aware of any more pending reviews, but I'll wait for yours, of course.
[issue14288] Make iterators pickleable
Antoine Pitrou pit...@free.fr added the comment: Well, please take a look at the review link. There are already some comments there.
[issue14288] Make iterators pickleable
Kristján Valur Jónsson krist...@ccpgames.com added the comment: Btw, regarding compatibility: The docs say "The pickle serialization format is guaranteed to be backwards compatible across Python releases." I take this to mean the serialization format itself. I don't think there is a broader guarantee that pickles generated by one version can be read by another, since objects can change their internal representation, and types can even disappear or change. In that sense, pickles made by 2.7 can be read by 3.2, in the sense that they will correctly return an error when they can't construct a 'str' object. If I misunderstand things, then at least I think that the pickle documentation should be made clearer: that not only is the protocol supposed to be readable, but also that entire pickles should always be readable _and_ instantiatable by future versions.
[issue14288] Make iterators pickleable
Kristján Valur Jónsson krist...@ccpgames.com added the comment: I've incorporated Antoine's comments and the proposed internal function name into a new patch. A lot of the changes concerned checking the type() of the unpickled iterator. Now, it wasn't a specific design goal to get the exact same objects back, only _equivalent_ objects. In particular, dicts and sets have problems, so a dictiter to a partially consumed dict cannot be pickled as it is. So, I've added type checks everywhere, except for those cases. Now, how important do you think type consistency is? When using iterators, does one ever look at it and test its type? If this is important, I _could_ take another look at dicts and sets and create fresh iterators to the dicts and sets made out of the remainder of the items, rather than iterators to lists. Any thoughts? -- Added file: http://bugs.python.org/file25042/pickling2.patch
[issue14288] Make iterators pickleable
Antoine Pitrou pit...@free.fr added the comment:
> Now, how important do you think type consistency is? When using iterators, does one ever look at it and test its type? If this is important, I _could_ take another look at dicts and sets and create fresh iterators to the dicts and sets made out of the remainder of the items, rather than iterators to lists.
I think type consistency is important if it can be achieved reasonably simply. In the dict and set case, I'm not sure you can recreate the internal table in the same order (even across interpreter restarts). In this case, you should just check that the unpickled data is an instance of collections.abc.Iterator.
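The check suggested above can be sketched as follows. This is only an illustration, assuming CPython 3.3 or later; the variable names are made up for the example, and the concrete type that a dict iterator unpickles to is an implementation detail that other interpreters may not share.

```python
import pickle
from collections.abc import Iterator

# A list iterator round-trips to the same concrete type.
list_it = iter([1, 2, 3])
next(list_it)
list_copy = pickle.loads(pickle.dumps(list_it))

# A dict key iterator may come back as a different concrete type
# (a list iterator in current CPython), so test the protocol instead.
dict_it = iter({"a": 1, "b": 2})
dict_copy = pickle.loads(pickle.dumps(dict_it))

same_type = type(list_copy) is type(list_it)
is_iterator = isinstance(dict_copy, Iterator)
```

Testing against the Iterator ABC keeps the unit tests independent of how a particular implementation chooses to rebuild dict and set iterators.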
[issue14288] Make iterators pickleable
sbt shibt...@gmail.com added the comment:
> ... and that pickling things like dict iterators entail running the iterator to completion and storing all of the results in a list.
The thing to emphasise here is that pickling an iterator is destructive: afterwards the original iterator will be empty. I can't think of any other examples where pickling an object causes non-trivial mutation of that object. Come to think of it, doesn't copy.copy() delegate to __reduce__()/__reduce_ex__()? It would be a bit surprising if copy.copy(myiterator) were to consume myiterator. I expect copy.copy() to return an independent copy without mutating the original object.
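The copy.copy() expectation stated above can be checked with a short sketch. It assumes CPython 3.3 or later, where list iterators implement __reduce__(); the variable names are illustrative only.

```python
import copy

original = iter([10, 20, 30])
next(original)                    # consume one item: 10

# copy.copy() falls back to __reduce_ex__() for iterator objects.
duplicate = copy.copy(original)

from_duplicate = list(duplicate)
from_original = list(original)    # the original keeps its remaining items
```

Both iterators yield the same remaining items, and they share the underlying list because the copy is shallow.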
[issue14288] Make iterators pickleable
Antoine Pitrou pit...@free.fr added the comment:
> The thing to emphasise here is that pickling an iterator is destructive: afterwards the original iterator will be empty.
If you look at the patch it isn't (or shouldn't be). I agree with Raymond that accumulating dict and set iterators in a list is a bit weird. That said, with hash randomization, perhaps we can't do any better (the order of elements in the internal table depends on the process-wide hash seed).
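A minimal sketch of the non-destructive behaviour described above, assuming CPython 3.3 or later. The names are illustrative, and the way the remaining keys are stored inside the pickle is an implementation detail.

```python
import pickle

data = {"a": 1, "b": 2, "c": 3}
it = iter(data)
first = next(it)

# Pickling snapshots the keys that have not been yielded yet.
payload = pickle.dumps(it)

# The original iterator is still usable after being pickled.
remaining_after_pickle = list(it)

restored = pickle.loads(payload)
restored_items = list(restored)
```

The restored iterator yields the same keys the original still had, so pickling acts as a snapshot rather than a drain.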
[issue14288] Make iterators pickleable
sbt shibt...@gmail.com added the comment:
> If you look at the patch it isn't (or shouldn't be).
Sorry. I misunderstood when Raymond said "running the iterator to completion".
[issue14288] Make iterators pickleable
Changes by Raymond Hettinger raymond.hettin...@gmail.com: -- assignee: rhettinger ->
[issue14288] Make iterators pickleable
Kristján Valur Jónsson krist...@ccpgames.com added the comment: sbt, I will fix the api name. Any other objections then? Leave it as it is with the iter() trick? -- versions: +Python 3.3 -Python 3.4
[issue14288] Make iterators pickleable
Changes by Antoine Pitrou pit...@free.fr: -- nosy: +pitrou
[issue14288] Make iterators pickleable
Raymond Hettinger raymond.hettin...@gmail.com added the comment: Has python-dev discussion been launched? It is far from clear that this is worth doing. Pickling runtime structures may be a normal use case for Stackless but isn't a normal use case for regular Python. Also, it seems pointless to start down this path because it will always be incomplete (i.e. pickling running generators, or socket streams, etc). It also seems to be at odds with the normal use case for passing around partially consumed iterators -- the laziness and memory friendliness is a desired feature; however, the patch pickles some iterators (such as dicts) by running them to completion and storing *all* of the results that would have been returned.
[issue14288] Make iterators pickleable
Martin v. Löwis mar...@v.loewis.de added the comment: I think the "worth doing" argument doesn't really hold, given that it's done. The question at hand really is a) is the patch correct? b) can we commit to maintaining it, even as things around it may change? I'm not bothered with the patch being potentially incomplete: anybody wishing to pickle more things should contribute patches for that. As for a), I think we should give people some time to review, and then wait for beta releases to discover issues. If this is a rarely-used feature, the world won't end if it has bugs. As for b), I think the main issue is forward compatibility: will pickles created by 3.3 still be readable by future versions?
[issue14288] Make iterators pickleable
Michael Foord mich...@voidspace.org.uk added the comment: Yes there was a discussion on python-dev. Various people spoke in favour, no-one against: http://mail.python.org/pipermail/python-dev/2012-March/117566.html
[issue14288] Make iterators pickleable
Raymond Hettinger raymond.hettin...@gmail.com added the comment: Michael, thanks for the link. The email was clearer about its rationale than was listed here. When this patch gets applied, any discussion of it in the docs should be clear that generators aren't included and that pickling things like dict iterators entails running the iterator to completion and storing all of the results in a list.
[issue14288] Make iterators pickleable
Georg Brandl ge...@python.org added the comment: The review link next to the patch file entry should already work and provide a nice visual diff + commenting interface. -- nosy: +georg.brandl
[issue14288] Make iterators pickleable
Raymond Hettinger raymond.hettin...@gmail.com added the comment: The dict iterators depend on the order of the dict being the same when unpickled on another python (the order will vary depending on dummy entries, insertion order, 32 vs 64 bit builds, salted hashes, etc). Sets have the same issue -- it doesn't seem possible to pickle a set iterator in a semi-consumed state without being able to reproduce the underlying unordered collection in *exactly* the same order and being able to point the resumed iterators to the correct part of memory. Any hacks to make this appear to work would likely be hard to reproduce across different implementations of Python (i.e. Jython's dicts are based on Java's concurrent mappings). There isn't a provision for saving and restoring running generators. There isn't a provision for iterators created using iter(func, sentinel) where successive func calls change state. I don't see how str iterators remember where they left off. Note, the prior effort to make iterators copyable was a failure. It was difficult to do in the general case and the cases we did provide had zero uptake (i.e. they were never used). ISTM that pickling iterators faces the same issues.
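Two of the points raised above can be illustrated with a short sketch, assuming CPython 3.3 or later (the release the changesets in this thread targeted). The variable names are invented for the example.

```python
import pickle

# Generators carry a suspended frame and are rejected by pickle.
gen = (n * n for n in range(5))
try:
    pickle.dumps(gen)
    generator_pickled = True
except TypeError:
    generator_pickled = False

# A str iterator is pickled as the string plus its current index,
# which is how it resumes where it left off.
chars = iter("abcd")
next(chars)
resumed = "".join(pickle.loads(pickle.dumps(chars)))
```

The str case works because the underlying sequence is immutable and indexable; generators have no comparable state that pickle can capture.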
[issue14288] Make iterators pickleable
Changes by Raymond Hettinger raymond.hettin...@gmail.com: -- Removed message: http://bugs.python.org/msg155759
[issue14288] Make iterators pickleable
sbt shibt...@gmail.com added the comment: I think PyAPI_FUNC(PyObject *) _PyIter_GetIter(const char *iter); has a confusing name for a convenience function which retrieves an attribute from the builtin module by name. Not sure what would be better. Maybe _PyIter_GetBuiltin(). -- nosy: +sbt
[issue14288] Make iterators pickleable
Kristján Valur Jónsson krist...@ccpgames.com added the comment: Another trick has been suggested: for hidden iterator objects, such as stringiter, to actually put them in the types module. In there, we could do something like:

    #types.py
    stringiter = iter('').__class__

and we would then change the name of the iterator in C to be types.stringiter. How does that sound? It _does_ make it necessary for the types module to be there to help with pickling. The _proper_ fix would be for e.g. stringiter to live in builtins, next to the 'str' that it is iterating over. Any thoughts?
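The idea of reaching a hidden iterator class through an instance can be sketched like this. It is an illustration only: the name stringiter is taken from the comment above, and the class's real __name__ differs between CPython versions, so the sketch does not depend on it.

```python
import builtins
from collections.abc import Iterator

# The class exists but is reachable only through an instance.
stringiter = type(iter(""))

is_class = isinstance(stringiter, type)
follows_protocol = issubclass(stringiter, Iterator)

# No builtin is published under the class's own name, so pickle
# cannot locate the class by a module-qualified name.
exposed_in_builtins = hasattr(builtins, stringiter.__name__)
```

Publishing such an alias in a module would give pickle a stable, importable name for the class, which is the trade-off being discussed.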
[issue14288] Make iterators pickleable
Changes by Raymond Hettinger raymond.hettin...@gmail.com: -- assignee: -> rhettinger
[issue14288] Make iterators pickleable
New submission from Kristján Valur Jónsson krist...@ccpgames.com: A common theme in many talks last year about cloud computing was the need to suspend execution, pickle state, and resume it on a different node. This patch is the result of last year's stackless sprint at pycon, finally completed and submitted for review. Python does not currently support pickling of many run-time structures, but pickling for things like iterators is trivial. A large part of the Stackless branch is there to make sure that various run-time constructs are pickleable, including function objects. While this patch does not do that, it does add pickling for dictiter and the lot. This makes it possible to have complicated data sets, iterate through them, and pickle them in a semi-consumed state. Please note that a slight hack is needed to pickle some iterators. Many of these classes are hidden, and there is no way to access their constructors by name. Instead, the unpickling trick is to invoke iter on an object of the target type. Not the most elegant solution, but I didn't want to complicate matters by adding iterator classes into namespaces. Where should stringiter live, for example? Be a builtin like str? We also didn't aim to make all iterators copy.copy-able using the __reduce__ protocol. Some iterators actually use internal iterators themselves, and if a (non-deep) copy were to happen, we would have to shallow copy those internal objects. Instead, we just return the internal iterator object directly from __reduce__ and allow recursive pickling to proceed.
-- files: pickling.patch keywords: patch messages: 155626 nosy: krisvale, loewis, michael.foord priority: normal severity: normal status: open title: Make iterators pickleable type: enhancement versions: Python 3.4 Added file: http://bugs.python.org/file24822/pickling.patch
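The "invoke iter on an object of the target type" trick described in the submission can be observed from Python. This is a small sketch, assuming CPython 3.3 or later; the exact tuple returned by __reduce__() is an implementation detail, and the variable names are illustrative.

```python
import pickle

it = iter([1, 2, 3, 4])
next(it)

# __reduce__ describes how to rebuild the object:
# call iter() on the underlying list, then restore the position.
rebuild, args, state = it.__reduce__()

# Round-trip through pickle and collect what is left.
restored = pickle.loads(pickle.dumps(it))
remaining = list(restored)
```

Here rebuild is the builtin iter, args holds the underlying list, and state is the index of the next item, so the hidden list iterator class never has to be named in the pickle.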
[issue14288] Make iterators pickleable
Changes by Jesús Cea Avión j...@jcea.es: -- nosy: +jcea
[issue14288] Make iterators pickleable
Raymond Hettinger raymond.hettin...@gmail.com added the comment: ISTM that a discussion on python-dev would be of value here. Iterators are a protocol, not a class; hence, the effort to make them all picklable is potentially endless. The effort would also always be incomplete because some iterators are difficult or inconvenient to pickle (esp. those with inputs from local or temporary resources). Experience with SQL hasn't shown a need to save partially consumed cursors, and my experience with iterators indicates that the need for pickling would be rare or that it would distract from better solutions (perhaps message based or somesuch). The size of this patch is a hint that the idea is not a minor change and that it would add a maintenance burden. What is far from clear is whether it would have value for any real-world problems. -- nosy: +rhettinger
[issue14288] Make iterators pickleable
Kristján Valur Jónsson krist...@ccpgames.com added the comment: Sure, I'll start a discussion there, but at least I've gotten the patch in. The patch is smaller than it looks, most of it is tests.
[issue14288] Make iterators pickleable
Raymond Hettinger raymond.hettin...@gmail.com added the comment: The patch looks fine. If python-dev thinks that making-iterators-picklable is worth doing, I would support this going into Python 3.3 (no extra benefit will come from waiting).
[issue14288] Make iterators pickleable
Kristján Valur Jónsson krist...@ccpgames.com added the comment: Btw, is there some way I can make this patch easier to review? I haven't contributed much since the Hg switchover, can I make it so that people can do visual diff?