New submission from Larry Hastings:

This probably shouldn't be checked in.  But it was an interesting experiment, 
and I did get it to work.

My brother forwarded me this question from Stack Overflow:
    
http://stackoverflow.com/questions/23453133/is-there-a-reason-python-3-enumerates-slower-than-python-2

The debate brought up a good point: a lot of the overhead of range() is in 
creating and destroying the long objects.  I wondered, could I get rid of that? 
 Long story short, yes.

rangeiterobject is a special-case range iterator used when you're iterating over 
integer values and the values all fit inside C longs.  (Otherwise range has to 
use the much slower general-purpose longrangeiterobject.)  rangeiter_next is 
simple: it computes the new value, then returns PyLong_FromLong of that value.
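As a rough pure-Python model of that C code (the class and its names are mine, 
purely illustrative, not CPython's actual implementation):

```python
class RangeIterModel:
    """Sketch of CPython's C-level rangeiterobject: index/start/step/len
    are plain machine words, and every step builds a brand-new int object
    (PyLong_FromLong in the C code)."""
    def __init__(self, start, stop, step=1):
        self.index = 0
        self.start = start
        self.step = step
        # number of values the range produces (0 if empty)
        self.len = max(0, (stop - start + (step - (1 if step > 0 else -1))) // step)

    def __iter__(self):
        return self

    def __next__(self):
        if self.index >= self.len:
            raise StopIteration
        result = self.start + self.index * self.step  # cheap C long arithmetic
        self.index += 1
        return result  # in C: PyLong_FromLong(result), a fresh object each time

print(list(RangeIterModel(0, 5)))     # [0, 1, 2, 3, 4]
print(list(RangeIterModel(5, 0, -2))) # [5, 3, 1]
```

The per-iteration object creation on the last line is exactly the overhead the 
patch tries to avoid.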

First thought: cache a reference to the previous value.  If its reference count 
is 1, we have the only reference.  Overwrite its value and return it.  But that 
doesn't help in the general case, because in "for x in range(1000)" x is 
holding a reference at the time __next__ is called on the iterator.
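You can watch this from Python with sys.getrefcount (the Probe class below is 
my own illustration, not part of the patch):

```python
import sys

class Probe:
    """Iterator that yields fresh objects and records, at each __next__
    call, the refcount of the object yielded on the previous iteration."""
    def __init__(self, n):
        self.n = n
        self.i = 0
        self.prev = None
        self.prev_counts = []

    def __iter__(self):
        return self

    def __next__(self):
        if self.prev is not None:
            # sys.getrefcount's own argument adds one temporary reference
            self.prev_counts.append(sys.getrefcount(self.prev))
        if self.i >= self.n:
            raise StopIteration
        self.prev = object()  # stands in for a freshly created long
        self.i += 1
        return self.prev

probe = Probe(3)
for x in probe:
    pass

# While __next__ runs, the loop variable x still holds the previous object,
# so its refcount is at least 3 (self.prev + x + getrefcount's argument) --
# never 1, which is why a one-object cache doesn't help.
print(probe.prev_counts)
```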

The trick: cache *two* old yielded objects.  In the general case, by the second 
iteration, everyone else has dropped their references to the older cached 
object and we can modify it safely.
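A pure-Python model of the two-slot trick (again my own sketch: ints can't be 
mutated from Python, so MutableCell stands in for a one-digit long):

```python
import sys

class MutableCell:
    """Stand-in for a one-digit long object whose value we can overwrite."""
    __slots__ = ("value",)
    def __init__(self, value):
        self.value = value

class CachingIter:
    """Two-slot cache: by the time an object is two iterations old, the
    loop variable has moved on, so if we hold the only reference to it
    we can recycle it instead of allocating a new one."""
    def __init__(self, n):
        self.i = 0
        self.n = n
        self.older = None  # yielded two iterations ago
        self.newer = None  # yielded last iteration
        self.reused = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.n:
            raise StopIteration
        old = self.older
        # 3 == self.older + local 'old' + getrefcount's argument:
        # nothing outside this iterator still references the object.
        if old is not None and sys.getrefcount(old) == 3:
            old.value = self.i  # overwrite in place
            result = old
            self.reused += 1
        else:
            result = MutableCell(self.i)
        self.older, self.newer = self.newer, result
        self.i += 1
        return result

it = CachingIter(1000)
total = 0
for x in it:
    total += x.value
print(total, it.reused)  # recycling kicks in from the third iteration on
```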

The second trick: if the value you're yielding is one of the interned "small 
ints", you *have* to return the interned small int.  (Otherwise you break 0 == 
0, I kid you not.)
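The interning is easy to observe on CPython (the cache range, currently -5 
through 256, is an implementation detail):

```python
# Built via int() calls so the compiler can't constant-fold and share literals.
a = int("5")
b = int("2") + int("3")
print(a is b)          # True on CPython: both are the interned small int 5

big_a = int("1000")
big_b = int("1000")
print(big_a is big_b)  # False on CPython: 1000 is outside the small-int cache
```

If the iterator handed out a recycled object where the interned one was 
expected, identity checks like these would silently go wrong.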

With this patch applied, all regression tests pass.  And, on my desktop 
machine, the benchmark used in the link above:

./python -mtimeit -n 5 -r 2 -s"cnt = 0" "for i in range(10000000): cnt += 1"

drops from 410ms to 318ms ("5 loops, best of 2: 318 msec per loop").
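The same measurement can be driven from Python rather than the shell (scaled 
down to 1,000,000 iterations here just to keep it quick):

```python
import timeit

# timeit compiles setup and stmt into one function, so 'cnt' is a shared local.
elapsed = timeit.timeit("for i in range(1000000): cnt += 1",
                        setup="cnt = 0", number=1)
print(f"{elapsed * 1000:.0f} msec per loop")
```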


This implementation requires the rangeiterobject to have intimate knowledge of 
the implementation of PyLongObject, including copying-and-pasting some 
information that isn't otherwise exposed (essentially the max small int).  At 
the very least that information would need to be exposed properly, so that 
rangeiterobject could use it correctly, before this could be checked in.

It might be cleaner for longobject.c to expose a private "set this long object 
to this value" function.  It would fail if we can't do it safely: if the value 
was an interned (small) int, or if the long was of the wrong size (Py_SIZE(o) 
!= 1).
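A sketch of what that helper's logic might look like, modeled in Python (the 
names, the FakeLong stand-in, and the assumed -5..256 cache range are all mine, 
not an actual CPython API):

```python
SMALL_INT_MIN, SMALL_INT_MAX = -5, 256  # assumed CPython small-int cache range

class FakeLong:
    """Stand-in for a PyLongObject; ndigits models Py_SIZE(o)."""
    def __init__(self, value, ndigits=1):
        self.value = value
        self.ndigits = ndigits

def try_set_long(obj, new_value):
    """Sketch of the proposed private 'set this long object to this value'
    helper: overwrite obj in place, or refuse when it wouldn't be safe."""
    if SMALL_INT_MIN <= obj.value <= SMALL_INT_MAX:
        return False  # obj may be an interned small int: never mutate it
    if SMALL_INT_MIN <= new_value <= SMALL_INT_MAX:
        return False  # caller must hand out the interned object instead
    if obj.ndigits != 1:
        return False  # wrong size (Py_SIZE(o) != 1): value won't fit in place
    obj.value = new_value
    return True

print(try_set_long(FakeLong(1000), 2000))            # safe: overwritten
print(try_set_long(FakeLong(1000), 5))               # refused: interned target
print(try_set_long(FakeLong(1000, ndigits=2), 2000)) # refused: wrong size
```

With such a helper, rangeiterobject would need no knowledge of PyLongObject's 
internals at all.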


Is this interesting enough to pursue?  I'm happy to drop it.

----------
assignee: larry
components: Interpreter Core
files: larry.range.hack.1.txt
messages: 242685
nosy: larry, pitrou, rhettinger, serhiy.storchaka
priority: normal
severity: normal
stage: patch review
status: open
title: Speed up range() by caching and modifying long objects
type: performance
versions: Python 3.5
Added file: http://bugs.python.org/file39307/larry.range.hack.1.txt

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue24138>
_______________________________________