On Wed, Dec 12, 2018 at 12:50:41PM +1300, Greg Ewing wrote:
> Steven D'Aprano wrote:
> >The iterator protocol is that iterators must:
> >
> >- have a __next__ method;
> >- have an __iter__ method which returns self;
> >
> >and the test for an iterator is:
> >
> >    obj is iter(obj)
>
> By that test, it identifies as a sequence, as does testing it
> for the presence of __len__:

Since existing map objects are iterators, that breaks backwards
compatibility. For code that does something like this:

    if obj is iter(obj):
        process_iterator()
    else:
        n = len(obj)
        process_sequence()

it will change behaviour, shifting map objects from the iterator branch
to the sequence branch. That's a definite change in behaviour, which
alone could change the meaning of the code. E.g. if the two process_*
functions use different algorithms.

Or it could break the code outright, because your MapView objects can
raise TypeError when you call len() on them.

I know that any object with a __len__ could in principle raise
TypeError. But for anything else, we are justified in calling it a bug
in the __len__ implementation. You're trying to sell it as a feature.

> >>> m is iter(m)
> False
> >>> hasattr(m, '__len__')
> True
>
> So, code that doesn't know whether it has a sequence or iterator
> and tries to find out, will conclude that it has a sequence.
> Presumably it will then proceed to treat it as a sequence, which
> will work fine.

It will work fine, unless something has called __next__, which will
cause len() to blow up in their face by raising TypeError.

I call these sorts of designs "landmines". They're absolutely fine,
right up to the point where you hit the right combination of actions
and step on the landmine.

For anything else, this sort of thing would be a bug. You're calling it
a feature.

> >py> x = MapView(str.upper, "abcdef")  # An imposter.
> >py> next(x)
> >'A'
> >py> next(x)
> >'B'
> >py> next(iter(x))
> >'A'
>
> That's a valid point, but it can be fixed:
>
>     def __iter__(self):
>         return self.iterator or map(self.func, *self.args)
>
> Now it gives
>
> >>> next(x)
> 'A'
> >>> list(x)
> []
>
> There is still one case that will behave differently from the
> current map(), i.e. using list() first and then expecting it
> to behave like an exhausted iterator. I'm finding it hard to
> imagine real code that would depend on that behaviour, though.

That's not the only breakage. This is a pattern which I sometimes use:

    def test(iterator):
        # Process items up to some symbol one way,
        # and items after that symbol another way.
        for a in iterator:
            print(1, a)
            if a == 'C': break
        # This relies on iterator NOT resetting to the beginning,
        # but continuing from where we left off
        # i.e. not being broken
        for b in iterator:
            print(2, b)

Since map() objects are iterators, right now I can pass them directly
to that code, and it works as expected:

    py> test(map(str.upper, 'abcde'))
    1 A
    1 B
    1 C
    2 D
    2 E

Your MapView does not:

    py> test(MapView(str.upper, 'abcde'))
    1 A
    1 B
    1 C
    2 A
    2 B
    2 C
    2 D
    2 E

This is why such iterators are deemed to be "broken".

> > whether operations succeed or not depend on the
> >order that you call them:
> >
> >py> x = MapView(str.upper, "abcdef")
> >py> len(x)*next(x)  # Safe. But only ONCE.
>
> But what sane code is going to do that?

You have an object that supports len() and next(). Why shouldn't people
use both len() and next() on it when both are supported methods? They
don't have to be in a single expression:

    x = MapView(blah blah blah)
    a = some_function_that_calls_len(x)
    b = some_function_that_calls_next(x)

That works. But reverse the order, and you step on a landmine:

    b = some_function_that_calls_next(x)
    a = some_function_that_calls_len(x)

The caller may not even know that the functions call next() or len();
they could be implementation details buried deep inside some library
function they didn't even know they were calling. Do you still think
that it is the caller's code that is insane?

> Remember, the iterator
> interface is only there for backwards compatibility.

Famous last words.

> That would fail under both Python 2 and the current Python 3.

Honestly Greg, you've been around long enough that you ought to
recognise *minimal examples* for what they are. They're not meant to be
real-world production code. They're the simplest, most minimal example
that demonstrates the existence of a problem.
The fact that they are *simple* is to make it easy to see the
underlying problem, not to give you an excuse to dismiss it. You're
supposed to imagine that in real-life code, the call to next() could be
buried deep, deep, deep in a chain of 15 function calls in some function
in some third-party library that I don't even know is being called, and
it took me a week to debug why len(obj) would sometimes fail
mysteriously.

The problem is not the caller, or even the library code, but that your
class magically and implicitly swaps from a sequence to a
pseudo-iterator whether I want it to or not.

A perfect example of why DWIM code is so hated:

http://www.catb.org/jargon/html/D/DWIM.html

> >py> def innocent_looking_function(obj):
> >...     next(obj)
> >...
> >py> x = MapView(str.upper, "abcdef")
> >py> len(x)
> >6
> >py> innocent_looking_function(x)
> >py> len(x)
> >TypeError: Mapping iterator has no len()
>
> If you're using len(), you clearly expect to have a sequence,
> not an iterator, so why are you calling a function that blindly
> expects an iterator?

*Minimal example* again.

You ought to be able to imagine the actual function is fleshed out,
without expecting me to draw you a picture:

    if hasattr(obj, '__next__'):
        first = next(obj, sentinel)

Or if you prefer:

    try:
        first = next(obj)
    except TypeError:
        # fall back on sequence algorithm
        ...
    except StopIteration:
        # empty iterator
        ...

None of this boilerplate adds any insight at all to the discussion.
There's a reason bug reports ask for minimal examples.

The point is, I'm calling some innocent looking function, and it breaks
my sequence: len(obj) worked before I called the function, and
afterwards, it raises TypeError.

I wouldn't have to care about the implementation if your MapView object
didn't magically flip from sequence to iterator behind my back.

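For anyone who wants to reproduce both problems, here is a minimal
sketch. The MapView class below is a hypothetical stand-in,
reconstructed only from the behaviour quoted in this thread (including
the proposed __iter__ fix); it is an assumption, not the actual proposed
implementation. split_at_c is an assumed name for a variant of test()
that collects pairs instead of printing them.

```python
class MapView:
    """Hypothetical stand-in for the MapView class under discussion.

    Assumption: reconstructed from the behaviour quoted in the thread,
    not copied from the actual proposal.
    """

    def __init__(self, func, *args):
        self.func = func
        self.args = args
        self.iterator = None  # created lazily by the first __next__ call

    def __len__(self):
        if self.iterator is not None:
            # Once next() has been called, the object stops being a sequence.
            raise TypeError("Mapping iterator has no len()")
        return min(len(arg) for arg in self.args)

    def __iter__(self):
        # The fix proposed in the quoted text.
        return self.iterator or map(self.func, *self.args)

    def __next__(self):
        if self.iterator is None:
            self.iterator = map(self.func, *self.args)
        return next(self.iterator)


def innocent_looking_function(obj):
    next(obj)


def split_at_c(iterable):
    """The two-loop pattern, collecting (loop, item) pairs."""
    seen = []
    for a in iterable:
        seen.append((1, a))
        if a == 'C':
            break
    # Relies on the iterator continuing from where the first loop stopped.
    for b in iterable:
        seen.append((2, b))
    return seen


# A real map object resumes after 'C' in the second loop.
real = split_at_c(map(str.upper, 'abcde'))

# The stand-in hands out a fresh iterator, so the second loop restarts.
fake = split_at_c(MapView(str.upper, 'abcde'))

# len() works, then fails after an unrelated-looking call.
x = MapView(str.upper, "abcdef")
length_before = len(x)
innocent_looking_function(x)
try:
    len(x)
    length_after = "no error"
except TypeError as err:
    length_after = str(err)

print(real)
print(fake)
print(length_before, length_after)
```

Under those assumptions, the second loop sees only D and E for the real
map object but all five letters for the stand-in, and the second len()
call raises TypeError even though the first one returned 6.
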
--
Steve