--- On Tue, 1/26/10, Stefan Behnel <stefan...@behnel.de> wrote:
> From: Stefan Behnel <stefan...@behnel.de>
> Subject: Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
> To: python-dev@python.org
> Date: Tuesday, January 26, 2010, 1:27 AM
> Michael Foord, 26.01.2010 01:14:
> > How great is the complication? Making list.pop(0)
> efficient sounds like
> > a worthy goal, particularly given that the reason you
> don't use it is
> > because you *know* it is inefficient
> 
> I agree. Giving a programmer the insight that lists are
> implemented as
> arrays in CPython is essentially saying "don't use
> list.pop(0), it's
> SLOW!". So they won't use it, and even avoid it for small
> lists where it
> doesn't really matter. It basically stinks.
> 
> Making list.pop(0) fast removes another edge where
> programmers must
> prematurely concentrate on implementation specifics of the
> interpreter they
> use, rather than the functionality they want to implement.
> I think that's
> an improvement to the simplicity that Python provides. It's
> basically
> saying: "care about performance when you have to, but
> otherwise believe in
> the core developers to make your code fast enough in most
> common cases".
> 

Thanks! I agree with you 100%, of course.

In fairness, I think my patch will only negligibly affect the performance of 
almost every Python program in existence.  I don't see any likelihood of 
negative effects for real world programs, but I also concede that the positive 
effects will usually be drowned out by other performance bottlenecks.

So I'd like to push for the patch on the grounds that it simply relieves Python 
programmers from worrying about a needless implementation detail.  But I am 
also prepared to have it rejected on the simple principle of keeping the 
CPython implementation less complex.  I've strived to make the changes as 
uninvasive as possible, but there is no getting around 1) increasing the size 
of PyListObject by an int, and 2) touching every major function in listobject.c 
related to memory management (w/list_resize getting the most surgery).  So 
there are certainly tradeoffs, unless I am overlooking some more clever way to 
implement this.

The core piece of the change that I made was to make deleting elements off the 
top of the list more efficient, by advancing a single pointer forward instead 
of moving an entire chunk of memory backward.  An incidental change was then to 
make inserting elements at the top of the list try to reclaim empty slots.  I 
think the former use case is valid and reasonably common (for some definition 
of "common").  The latter use case is probably a lot more rare, but I 
implemented the insert optimization to avoid the spiky double-memmove that the 
delete optimization would have otherwise introduced to prepends that succeeded 
pops.

My current strategy for giving memory back to the OS is overly simple and needs 
refinement, but it basically says that the number of orphaned pointers cannot 
exceed the number of active elements in the list.  This might sound wasteful, 
but the reality is that N list elements consume a lot more memory than N 
orphaned pointers, and the greater attractiveness of list.pop(0) would of 
course lead to earlier garbage collection of the popped elements in real world 
programs.


_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Reply via email to