On Tue, 12 Oct 2021 at 12:50, Chris Angelico <ros...@gmail.com> wrote:
>
> On Tue, Oct 12, 2021 at 10:24 PM Oscar Benjamin
> <oscar.j.benja...@gmail.com> wrote:
> >
> > On Tue, 12 Oct 2021 at 11:48, Chris Angelico <ros...@gmail.com> wrote:
> >>
> >> ValueError is no safer. The first() function would have, as its API,
> >> "returns the first element or raises ValueError if there is none". So
> >> now the caller of first() has to use try/except to handle the case
> >> where there is no value. Failing to do so is *just as buggy* as
> >> leaking a StopIteration.
> >>
> >> A leaky StopIteration is a majorly confusing bug inside a __next__
> >> function, because StopIteration is part of that function's API.
> >
> > On the contrary: a __next__ function is the only place where it could 
> > possibly be valid to raise StopIteration. The fact that next raises 
> > StopIteration which passes through to the caller can be useful in this 
> > situation and this situation alone:
> > https://github.com/python/cpython/blob/b37dc9b3bc9575adc039c6093c643b7ae5e917e1/Lib/csv.py#L111
> >
> > In any other situation it would be better to call first() and have 
> > something like ValueError instead.
> >
>
> Yes, but that's an example of __next__ specifically chaining to next()
> - exactly like defining __getattr__ to look for an attribute of
> something else (maybe you're writing a proxy of some sort). You expect
> that a bubbling-up exception is fundamentally equivalent to one you
> raise yourself.
>
> Please give a real example of where calling first() and getting
> ValueError is safer than calling next(iter(x)) and getting
> StopIteration. So far, I am undeterred in believing that the two
> exceptions have equivalent effect if the caller isn't expecting them.

I think that the situation where I first came across this was
something analogous to wanting to separate the header line of a CSV
file:

csvfiles = [
    ['name', 'joe', 'dave'],
    ['name', 'steve', 'chris'],
    [],  # whoops, empty csv file
    ['name', 'oscar'],
]

def remove_header(csvfile):
    it = iter(csvfile)
    next(it)
    return it

# print all names from all csv files
for names in map(remove_header, csvfiles):
    for name in names:
        print(name)

If you run the above you get

$ python t.py
joe
dave
steve
chris

The data following the empty file (i.e. "oscar") was silently
discarded. The context where I found this was something that took much
longer to run and was harder to check and debug etc. I have not
personally made the same mistake again because I have since been
automatically wary of any usage of next(). I couldn't possibly count
the number of times I've seen unsafe usage of next in code suggestions
on mailing lists like this though (see all the examples above in this
thread).

The problem is that the erroneous case which is the empty file leads
to a StopIteration. Unlike a normal exception though, a StopIteration
can be caught without try/except. In the above it is the *for-loop*
that swallows the exception. Had it been literally any other exception
type then you would have been looking at a Traceback instead of
silently discarded data:

$ python t.py
joe
dave
steve
chris
Traceback (most recent call last):
  File "t.py", line 21, in <module>
    for names in map(remove_header, csvfiles):
  File "t.py", line 18, in remove_header
    first(it)
  File "t.py", line 14, in first
    raise ValueError from None
ValueError

Your suggestion is that this is a bug in map() which is a fair
alternative view. Following through to its conclusion your suggestion
is that every possible function like map, filter, and all the iterator
implementations in itertools and in the wild should carefully wrap any
internal non-next function call in try/except to change any potential
StopIteration into a different exception type.

My view is that it would be better to have a basic primitive for
getting an element from an iterable or for advancing an iterator that
does not raise StopIteration in the first place. I would probably call
that function something like "take" rather than "first" though. The
reason I prefer introducing an alternative to next() is because I
think that if both primitives were available then in the majority of
situations next() would not be the preferred option.

--
Oscar
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/SYXMROAL4B5GR5NQ5VHLXDU4IXYNS4UV/
Code of Conduct: http://python.org/psf/codeofconduct/

Reply via email to