New submission from Larry Hastings:

This probably shouldn't be checked in.  But it was an interesting experiment, 
and I did get it to work.

My brother forwarded me this question from Stack Overflow:
    
http://stackoverflow.com/questions/23453133/is-there-a-reason-python-3-enumerates-slower-than-python-2

The debate brought up a good point: a lot of the overhead of range() is in 
creating and destroying the long objects.  I wondered, could I get rid of that? 
 Long story short, yes.

rangeiterobject is a special-case range iterator used when you're iterating over 
integer values and the values all fit inside C longs.  (Otherwise range has to 
use the much slower general-purpose longrangeiterobject.)  rangeiter_next is 
simple: it computes the new value, then returns PyLong_FromLong of that value.
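As a rough pure-Python model of that C code (the class and its names are mine, 
purely illustrative, not CPython's actual implementation):

```python
class RangeIterModel:
    """Sketch of CPython's C-level rangeiterobject: index/start/step/len
    are plain machine words, and every step builds a brand-new int object
    (PyLong_FromLong in the C code)."""
    def __init__(self, start, stop, step=1):
        self.index = 0
        self.start = start
        self.step = step
        # number of values the range produces (0 if empty)
        self.len = max(0, (stop - start + (step - (1 if step > 0 else -1))) // step)

    def __iter__(self):
        return self

    def __next__(self):
        if self.index >= self.len:
            raise StopIteration
        result = self.start + self.index * self.step  # cheap C long arithmetic
        self.index += 1
        return result  # in C: PyLong_FromLong(result), a fresh object each time

print(list(RangeIterModel(0, 5)))     # [0, 1, 2, 3, 4]
print(list(RangeIterModel(5, 0, -2))) # [5, 3, 1]
```

The per-iteration object creation on the last line is exactly the overhead the 
patch tries to avoid.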

First thought: cache a reference to the previous value.  If its reference count 
is 1, we have the only reference.  Overwrite its value and return it.  But that 
doesn't help in the general case, because in "for x in range(1000)" x is 
holding a reference at the time __next__ is called on the iterator.
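You can watch this from Python with sys.getrefcount (the Probe class below is 
my own illustration, not part of the patch):

```python
import sys

class Probe:
    """Iterator that yields fresh objects and records, at each __next__
    call, the refcount of the object yielded on the previous iteration."""
    def __init__(self, n):
        self.n = n
        self.i = 0
        self.prev = None
        self.prev_counts = []

    def __iter__(self):
        return self

    def __next__(self):
        if self.prev is not None:
            # sys.getrefcount's own argument adds one temporary reference
            self.prev_counts.append(sys.getrefcount(self.prev))
        if self.i >= self.n:
            raise StopIteration
        self.prev = object()  # stands in for a freshly created long
        self.i += 1
        return self.prev

probe = Probe(3)
for x in probe:
    pass

# While __next__ runs, the loop variable x still holds the previous object,
# so its refcount is at least 3 (self.prev + x + getrefcount's argument) --
# never 1, which is why a one-object cache doesn't help.
print(probe.prev_counts)
```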

The trick: cache *two* old yielded objects.  In the general case, by the second 
iteration, everyone else has dropped their references to the older cached 
object and we can modify it safely.
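A pure-Python model of the two-slot trick (again my own sketch: ints can't be 
mutated from Python, so MutableCell stands in for a one-digit long):

```python
import sys

class MutableCell:
    """Stand-in for a one-digit long object whose value we can overwrite."""
    __slots__ = ("value",)
    def __init__(self, value):
        self.value = value

class CachingIter:
    """Two-slot cache: by the time an object is two iterations old, the
    loop variable has moved on, so if we hold the only reference to it
    we can recycle it instead of allocating a new one."""
    def __init__(self, n):
        self.i = 0
        self.n = n
        self.older = None  # yielded two iterations ago
        self.newer = None  # yielded last iteration
        self.reused = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.n:
            raise StopIteration
        old = self.older
        # 3 == self.older + local 'old' + getrefcount's argument:
        # nothing outside this iterator still references the object.
        if old is not None and sys.getrefcount(old) == 3:
            old.value = self.i  # overwrite in place
            result = old
            self.reused += 1
        else:
            result = MutableCell(self.i)
        self.older, self.newer = self.newer, result
        self.i += 1
        return result

it = CachingIter(1000)
total = 0
for x in it:
    total += x.value
print(total, it.reused)  # recycling kicks in from the third iteration on
```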

The second trick: if the value you're yielding is one of the interned "small 
ints", you *have* to return the interned small int.  (Otherwise you break 0 == 
0, I kid you not.)
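The interning is easy to observe on CPython (the cache range, currently -5 
through 256, is an implementation detail):

```python
# Built via int() calls so the compiler can't constant-fold and share literals.
a = int("5")
b = int("2") + int("3")
print(a is b)          # True on CPython: both are the interned small int 5

big_a = int("1000")
big_b = int("1000")
print(big_a is big_b)  # False on CPython: 1000 is outside the small-int cache
```

If the iterator handed out a recycled object where the interned one was 
expected, identity checks like these would silently go wrong.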

With this patch applied, all regression tests pass.  And, on my desktop 
machine, the benchmark used in the link above:

./python -mtimeit -n 5 -r 2 -s"cnt = 0" "for i in range(10000000): cnt += 1"

drops from 410ms to 318ms ("5 loops, best of 2: 318 msec per loop").
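The same measurement can be driven from Python rather than the shell (scaled 
down to 1,000,000 iterations here just to keep it quick):

```python
import timeit

# timeit compiles setup and stmt into one function, so 'cnt' is a shared local.
elapsed = timeit.timeit("for i in range(1000000): cnt += 1",
                        setup="cnt = 0", number=1)
print(f"{elapsed * 1000:.0f} msec per loop")
```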


This implementation requires the rangeiterobject to have intimate knowledge of 
the implementation of PyLongObject, including copying-and-pasting some 
information that isn't otherwise exposed (essentially the max small int).  At 
the very least that information would need to be exposed properly, so that 
rangeiterobject could use it correctly, before this could be checked in.

It might be cleaner for longobject.c to expose a private "set this long object 
to this value" function.  It would fail if we can't do it safely: if the value 
was an interned (small) int, or if the long was of the wrong size (Py_SIZE(o) 
!= 1).
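A sketch of what that helper's logic might look like, modeled in Python (the 
names, the FakeLong stand-in, and the assumed -5..256 cache range are all mine, 
not an actual CPython API):

```python
SMALL_INT_MIN, SMALL_INT_MAX = -5, 256  # assumed CPython small-int cache range

class FakeLong:
    """Stand-in for a PyLongObject; ndigits models Py_SIZE(o)."""
    def __init__(self, value, ndigits=1):
        self.value = value
        self.ndigits = ndigits

def try_set_long(obj, new_value):
    """Sketch of the proposed private 'set this long object to this value'
    helper: overwrite obj in place, or refuse when it wouldn't be safe."""
    if SMALL_INT_MIN <= obj.value <= SMALL_INT_MAX:
        return False  # obj may be an interned small int: never mutate it
    if SMALL_INT_MIN <= new_value <= SMALL_INT_MAX:
        return False  # caller must hand out the interned object instead
    if obj.ndigits != 1:
        return False  # wrong size (Py_SIZE(o) != 1): value won't fit in place
    obj.value = new_value
    return True

print(try_set_long(FakeLong(1000), 2000))            # safe: overwritten
print(try_set_long(FakeLong(1000), 5))               # refused: interned target
print(try_set_long(FakeLong(1000, ndigits=2), 2000)) # refused: wrong size
```

With such a helper, rangeiterobject would need no knowledge of PyLongObject's 
internals at all.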


Is this interesting enough to pursue?  I'm happy to drop it.

----------
assignee: larry
components: Interpreter Core
files: larry.range.hack.1.txt
messages: 242685
nosy: larry, pitrou, rhettinger, serhiy.storchaka
priority: normal
severity: normal
stage: patch review
status: open
title: Speed up range() by caching and modifying long objects
type: performance
versions: Python 3.5
Added file: http://bugs.python.org/file39307/larry.range.hack.1.txt

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue24138>
_______________________________________