On Tue, Oct 12, 2021 at 10:24 PM Oscar Benjamin
<oscar.j.benja...@gmail.com> wrote:
>
>
> On Tue, 12 Oct 2021 at 11:48, Chris Angelico <ros...@gmail.com> wrote:
>>
>> On Tue, Oct 12, 2021 at 8:43 PM Oscar Benjamin
>> <oscar.j.benja...@gmail.com> wrote:
>> > A leaky StopIteration can wreak all sorts of havoc. There was a PEP that 
>> > attempted to solve this by turning StopIteration into RuntimeError if it 
>> > gets caught in a generator but that PEP (which was rushed through very 
>> > quickly IIRC) missed the fact that generators are not the only iterators. 
>> > It remains a problem that leaking a StopIteration into map, filter etc 
>> > will terminate iteration of an outer loop.
>> >
>>
>> Generators are special because they never mention StopIteration. They
>> are written like functions, but behave like iterators. That is why
>> StopIteration leaking is such a problem.
>
>
> Generators are a common case and are important so the PEP definitely helps. 
> It is incomplete though because the problem remains for other cases. 
> StopIteration is rarely mentioned anywhere e.g. there is nothing about it in 
> the docstring for map:
> https://docs.python.org/3/library/functions.html#map

If you want to report it as a bug in map(), feel free to do so. It's
not a general issue to be solved. I would say that the first version of
map() below is naive, and the other two are safe:

class map_naive:
    def __init__(self, func, it):
        self.func = func; self.it = iter(it)
    def __iter__(self): return self
    def __next__(self):
        return self.func(next(self.it))

class map_safe:
    def __init__(self, func, it):
        self.func = func; self.it = iter(it)
    def __iter__(self): return self
    def __next__(self):
        value = next(self.it)
        try:
            return self.func(value)
        except StopIteration:
            raise ValueError("StopIteration raised by map function")

def map_alsosafe(func, it):
    for value in it: yield func(value)

The distinction between naive and safe is *inside the definition of
__next__*, and nowhere else. The fault isn't in the function that you
pass to map, any more than having it raise AttributeError would be a
fault. The reason generators are special is that, despite not having
__next__ visible anywhere, they still have that same consideration.
That's why they automatically transform StopIterations.
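
A rough sketch of how the three versions behave when the mapped function
leaks a StopIteration. The first() helper and the sample data are made up
for illustration, and the RuntimeError outcome assumes Python 3.7 or later,
where the PEP 479 behaviour is the default:

```python
class map_naive:
    def __init__(self, func, it):
        self.func = func
        self.it = iter(it)
    def __iter__(self):
        return self
    def __next__(self):
        # A StopIteration raised by func looks like normal exhaustion.
        return self.func(next(self.it))

class map_safe(map_naive):
    def __next__(self):
        value = next(self.it)  # exhaustion of the source: let it bubble
        try:
            return self.func(value)
        except StopIteration:
            raise ValueError("StopIteration raised by map function")

def map_alsosafe(func, it):
    for value in it:
        yield func(value)

def first(seq):
    # Hypothetical helper: bare next() leaks StopIteration on empty input.
    return next(iter(seq))

def outcome(mapper, data):
    # Return the resulting list, or the name of the exception raised.
    try:
        return list(mapper(first, data))
    except Exception as exc:
        return type(exc).__name__

data = [[1, 2], [], [3, 4]]
naive_result = outcome(map_naive, data)    # [1]: silently truncated
safe_result = outcome(map_safe, data)      # 'ValueError'
gen_result = outcome(map_alsosafe, data)   # 'RuntimeError'
```

Only the body of __next__ differs between the first two, which is the
whole point.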

>> In every other situation, StopIteration is part of the API of what
>> you're working with. It is a bug to call next() without checking for
>> StopIteration (or knowingly and intentionally permitting it to
>> bubble).
>
>
> Exactly: simple usage of next is often a bug. We need to be careful about 
> this every time someone suggests that it's straight-forward to do 
> next(iter(obj)).

Yes, but "give me the first entry" is underspecified anyway. What
SHOULD happen if there is no first entry? Is ValueError particularly
different? If you do the naive thing and leak StopIteration, most
likely it'll end up on the console.
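
To make the question concrete, here is one possible shape for such a
function. This is only a sketch: the name first(), the default parameter
and the _MISSING sentinel are assumptions for illustration, not an existing
builtin. It relies on the two-argument form of next(), which returns the
default instead of raising:

```python
_MISSING = object()  # sentinel, so that None can be a legitimate default

def first(iterable, default=_MISSING):
    """Hypothetical first(): first item, else default, else ValueError."""
    item = next(iter(iterable), default)
    if item is _MISSING:
        # No item and no default supplied: the caller still has to
        # decide what an empty input means.
        raise ValueError("first() called on an empty iterable")
    return item

with_item = first([10, 20])         # 10
with_default = first([], None)      # None
try:
    first([])
except ValueError as exc:
    empty_error = type(exc).__name__
```

Either way the caller has to pick one of these behaviours deliberately.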

>> > The culprit for the problem of leaking StopIteration is next itself which 
>> > in the 1-arg form is only really suitable for use when implementing an 
>> > iterator and not for the much more common case of simply wanting to 
>> > extract something from an iterable. Numerous threads here and on 
>> > stackoverflow and elsewhere suggesting that you can simply use 
>> > next(iter(obj)) are encouraging bug magnet code. Worse, the bug when it 
>> > arises will easily manifest in something like silent data loss and can be 
>> > hard to debug.
>> >
>>
>> That's no worse than getattr() and AttributeError. If you call getattr
>> and you aren't checking for AttributeError, then you could be running
>> into the exact same sorts of problems, because AttributeError is part
>> of the function's API.
>
>
> The difference is that you usually don't try to catch AttributeError in a 
> higher up frame. A function that leaks StopIteration is not iterator-safe and 
> can not be used with functional iterator tools like map. The exact reason for 
> the danger of bare next is not obvious even to experienced Python 
> programmers. Before the discussions around the PEP I had pointed it out 
> several times and saw experienced commenters on lists like this being 
> confused about what exactly the problem was. Maybe I'm not good at explaining 
> myself but if the problem was obvious then it shouldn't have needed careful 
> explanation.
>

Nor do you usually catch StopIteration. There are very very few cases
where a StopIteration will silently truncate something, and they are
all cases where the function should probably be changed. In user code,
it's the rule of thumb that I described: be aware of StopIteration
when writing __next__ or calling next(), otherwise it shouldn't be a
problem.

The problem is most definitely NOT obvious, because most situations
are simply *not a problem*, and most of the ones that ARE a problem
would still be just as much of a problem with any other exception.

>> > The real advantage of providing first (or "take" or any of the other names 
>> > that have been proposed in the past) is that it should raise a different 
>> > exception like ValueError so that it would be safe to use by default.
>> >
>>
>> ValueError is no safer. The first() function would have, as its API,
>> "returns the first element or raises ValueError if there is none". So
>> now the caller of first() has to use try/except to handle the case
>> where there is no value. Failing to do so is *just as buggy* as
>> leaking a StopIteration.
>>
>> A leaky StopIteration is a majorly confusing bug inside a __next__
>> function, because StopIteration is part of that function's API.
>
>
> On the contrary: a __next__ function is the only place where it could 
> possibly be valid to raise StopIteration. The fact that next raises 
> StopIteration which passes through to the caller can be useful in this 
> situation and this situation alone:
> https://github.com/python/cpython/blob/b37dc9b3bc9575adc039c6093c643b7ae5e917e1/Lib/csv.py#L111
>
> In any other situation it would be better to call first() and have something 
> like ValueError instead.
>

Yes, but that's an example of __next__ specifically chaining to next()
- exactly like defining __getattr__ to look for an attribute of
something else (maybe you're writing a proxy of some sort). You expect
that a bubbling-up exception is fundamentally equivalent to one you
raise yourself.
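
A small sketch of that parallel. The Proxy and Upper classes are invented
for illustration. In both, the exception coming out of the inner call
(AttributeError from getattr, StopIteration from next) is deliberately
allowed to bubble up, because it means the same thing in the outer method:

```python
class Proxy:
    """Forwards unknown attribute lookups to a wrapped object."""
    def __init__(self, target):
        self._target = target
    def __getattr__(self, name):
        # An AttributeError from the target is exactly the error
        # this method is expected to raise itself.
        return getattr(self._target, name)

class Upper:
    """Iterator that upper-cases each string from another iterator."""
    def __init__(self, it):
        self._it = iter(it)
    def __iter__(self):
        return self
    def __next__(self):
        # A StopIteration from the inner iterator means this one
        # is exhausted too, so it is allowed to propagate.
        return next(self._it).upper()

shouted = list(Upper(["a", "b"]))      # ['A', 'B']
forwarded = Proxy("abc").upper()       # 'ABC'
try:
    Proxy("abc").no_such_attribute
except AttributeError as exc:
    missing = type(exc).__name__
```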

Please give a real example of where calling first() and getting
ValueError is safer than calling next(iter(x)) and getting
StopIteration. So far, I am undeterred in believing that the two
exceptions have equivalent effect if the caller isn't expecting them.

ChrisA
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2AL5FE3KZI4EBTRMJ7O5EL6MBVN7RUYF/
Code of Conduct: http://python.org/psf/codeofconduct/
