[Python-Dev] Re: PEP 618: Add Optional Length-Checking To zip

Brandt Bucher Thu, 14 May 2020 11:16:29 -0700

Ethan Furman wrote:
> So half of your examples are actually counter-examples.

I claimed to have found "dozens of other call sites in Python's standard 
library and tooling where it would be appropriate to enable this new feature". 
You asked for references, and I provided two dozen cases of zipping iterables 
that must be of equal length.

I said they were "appropriate", not "needed" or even "recommended". These are 
call sites where unequal-length iterables, if encountered, would be an error 
that I would hope wouldn't pass silently. Besides, I don't think it's beyond 
the realm of imagination for a future refactoring of several of the "Mismatch 
cannot happen." cases to introduce a bug of this kind.
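
As a small illustration of the kind of silent failure meant here (a made-up 
example, not one of the stdlib call sites; `headers` and `row` are 
hypothetical names), today's `zip` drops the unmatched item without complaint:

```python
# Hypothetical data: a refactoring left `row` one field short.
headers = ["name", "size", "mtime"]
row = ["setup.py", 1024]

# Plain zip stops at the shorter input, so "mtime" is silently lost.
record = dict(zip(headers, row))
print(record)
```

Nothing signals that `mtime` was never populated, and the bug only surfaces 
later, far from the `zip` call.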

> Did you vet them, or just pick matches against `zip(`?

Of course I vetted them. I spent hours on it, to the point of researching the 
GNU tar extended sparse header and Apple property list formats (and trying to 
figure out what the hell was happening in `os._fwalk`) just to make sure my 
understanding was correct.

Ethan Furman wrote:
> Not the call itself, but the running of zip.  Absent some clever programming 
> it seems to me that there are two choices if we have a flag:

I wouldn't call my implementation "clever", but it differs from both of these 
options.  We only need to check whether we're strict when the `__next__` call 
on one of our iterators fails, which is a situation the C code for `zip` 
already needs to handle explicitly with a branch. So this condition is only 
hit on the "last" `__next__` call, not on every single iteration.
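
To make that control flow concrete, here is a minimal pure-Python sketch. It 
is not the C implementation and not the PEP's reference code; the name 
`zip_sketch` and the error messages are made up for illustration. The point is 
only that `strict` is consulted inside the branch that already ends iteration:

```python
def zip_sketch(*iterables, strict=False):
    """Illustrative only: strict is checked solely when an iterator runs out."""
    iterators = [iter(iterable) for iterable in iterables]
    if not iterators:
        return
    sentinel = object()
    while True:
        items = []
        for index, iterator in enumerate(iterators):
            item = next(iterator, sentinel)
            if item is sentinel:
                # Plain zip already needs this branch to stop iterating.
                # The strict check lives here, so the per-item path is unchanged.
                if strict:
                    if index > 0:
                        raise ValueError(
                            f"argument {index + 1} is shorter than earlier arguments"
                        )
                    # The first iterator ran out; the rest must be exhausted too.
                    for position, later in enumerate(iterators[1:], start=2):
                        if next(later, sentinel) is not sentinel:
                            raise ValueError(
                                f"argument {position} is longer than argument 1"
                            )
                return
            items.append(item)
        yield tuple(items)


print(list(zip_sketch("ab", [1, 2], strict=True)))
print(list(zip_sketch("abc", [1, 2])))
```

With equal-length inputs the strict path never raises, and with `strict` left 
at its default the generator behaves like today's `zip`.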

As a reminder, the actual C implementation is linked in the PEP (there's no PR 
yet but branch reviews are welcome), though I'd prefer if the PEP discussion 
didn't get bogged down in those specifics.  The pure-Python implementation in 
the PEP is *very* close to it, but it uses different abstractions for some of 
the details regarding error handling and argument parsing.[0]

However, for those who are interested, there is no measurable performance 
regression (and no additional parsing overhead for calls without keyword 
arguments). Parsing the keyword argument (if present) adds <0.2us of overhead 
at creation time on my machine. I went ahead and ran some rough PGO/LTO 
benchmarks:

Creation time:

```
$ ./python-master     -m pyperf timeit 'zip()'
Mean +- std dev: 79.4 ns +- 4.3 ns
$ ./python-zip-strict -m pyperf timeit 'zip()'
Mean +- std dev: 79.0 ns +- 1.9 ns
$ ./python-zip-strict -m pyperf timeit 'zip(strict=True)'
Mean +- std dev: 240 ns +- 8 ns
```

Creation time + iteration time:

```
$ ./python-master     -m pyperf timeit -s 'r = range(10)' '[*zip(r, r)]'
Mean +- std dev: 577 ns +- 35 ns
$ ./python-zip-strict -m pyperf timeit -s 'r = range(10)' '[*zip(r, r)]'
Mean +- std dev: 565 ns +- 16 ns
$ ./python-zip-strict -m pyperf timeit -s 'r = range(10)' '[*zip(r, r, strict=True)]'
Mean +- std dev: 756 ns +- 27 ns

$ ./python-master     -m pyperf timeit -s 'r = range(100)' '[*zip(r, r)]'
Mean +- std dev: 3.54 us +- 0.14 us
$ ./python-zip-strict -m pyperf timeit -s 'r = range(100)' '[*zip(r, r)]'
Mean +- std dev: 3.49 us +- 0.07 us
$ ./python-zip-strict -m pyperf timeit -s 'r = range(100)' '[*zip(r, r, strict=True)]'
Mean +- std dev: 3.73 us +- 0.13 us

$ ./python-master     -m pyperf timeit -s 'r = range(1000)' '[*zip(r, r)]'
Mean +- std dev: 44.1 us +- 2.0 us
$ ./python-zip-strict -m pyperf timeit -s 'r = range(1000)' '[*zip(r, r)]'
Mean +- std dev: 45.2 us +- 2.0 us
$ ./python-zip-strict -m pyperf timeit -s 'r = range(1000)' '[*zip(r, r, strict=True)]'
Mean +- std dev: 45.2 us +- 1.4 us
```

Additionally, the size of a `zip` instance has not changed.  Pickles for 
non-strict `zip` instances are unchanged as well.
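
One rough way to inspect both properties on a given build, assuming only the 
stdlib `sys` and `pickle` modules (the variable names are illustrative, and 
the numbers have to be compared across the two builds by hand):

```python
import pickle
import sys

# Size in bytes of an empty zip object on this interpreter build.
size = sys.getsizeof(zip())
print(size)

# Pickle payload for a non-strict zip; compare the bytes between builds.
payload = pickle.dumps(zip([1, 2], [3, 4]))
print(payload)

# Round-tripping the pickle should give back an equivalent iterator.
restored_pairs = list(pickle.loads(payload))
print(restored_pairs)
```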

Brandt

[0] And zip's current tuple caching, which is *very* clever.