On 09/03/2012 09:26 PM, Steven D'Aprano wrote: > On Mon, 03 Sep 2012 13:04:23 -0700, Aaron Brady wrote: > >> <snip> >> >> I think the technique would require two python longs and a bool in the >> set, and a python long in the iterator. >> >> One long counts the number of existing (open) iterators. Another counts >> the version. The bool keeps track of whether an iterator has been >> created since the last modification, in which case the next modification >> requires incrementing the version and resetting the flag. > I think that is over-engineered and could be the difference between > having the patch accepted and having it rejected. After all, many people > will argue that the existing solution to the problem is good enough. > > Dicts are extremely common, and your patch increases both the memory > usage of every dict, and the overhead of every write operation > (__setitem__, __delitem__, update). Only a very few dicts will actually > need this overhead, for the rest it is waste. It is important to keep > that waste to a minimum or risk having the patch rejected. > > An unsigned C int can count up to 4,294,967,295. I propose that you say > that is enough iterators for anyone, and use a single, simple, version > counter in the dict and the iterator. If somebody exceeds that many > iterators to a single dict or set,
I think you have the count confused. it has to be a count of how many changes have been made to the dict or set, not how many iterators exist. > and the version field overflows by > exactly 2**32 versions, the results are no worse than they are now. You > won't be introducing any new bugs. > > Complicating the solution is, in my opinion, unnecessary. Why should > every set and dict carry the cost of incrementing TWO Python longs and a > flag when just a single C int is sufficient for all realistic use-cases? > > The opportunity for failure is extremely narrow: > > - I must have an iterator over a dict or set; > - and I must have paused the iteration in some way; > - and then I must create exactly 2**32 other iterators to the > same dict or set; > - and at some point modify the dict or set > - and then restart the first iterator > > at which point some items returned by the iterator *may* be duplicated or > skipped (depends on the nature of the modifications). > > I agree with almost your entire point, exact that the place where it would fail to detect a bug is when somebody has modified the dict exactly 2**32 times while an iterator is paused. See my own response, of 4:27pm. (my time). -- DaveA -- http://mail.python.org/mailman/listinfo/python-list