[issue23910] C implementation of namedtuple (WIP)

Raymond Hettinger Sun, 26 Apr 2015 14:34:52 -0700

Raymond Hettinger added the comment:

FWIW, the current property(itemgetter(index)) code has a Python creation step,
but the actual attribute lookup and dispatch are done entirely in C (no pure
Python steps around the eval loop).
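
For reference, here is a minimal sketch of the pattern under discussion: a
hand-written tuple subclass whose fields are property(itemgetter(index))
descriptors.  The class name Point and its fields are only illustrative, and
this is not the generated namedtuple source.

```python
from operator import itemgetter


class Point(tuple):
    """Hand-written stand-in for a two-field named tuple (illustrative only)."""
    __slots__ = ()

    def __new__(cls, x, y):
        return tuple.__new__(cls, (x, y))

    # Building these descriptors is the one-time Python "creation step".
    # Reading p.x afterwards only involves the C-implemented property and
    # itemgetter objects.
    x = property(itemgetter(0), doc='Alias for field number 0')
    y = property(itemgetter(1), doc='Alias for field number 1')


p = Point(250, 43)
print(p.x, p.y)
```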


Rather than making a user-visible C hack directly to namedtuple, any
optimization effort should be directed at improving the speed of property() and
itemgetter().

Here are some quick notes to get you started:

* The overhead of property() is almost nothing.
* The code is in property_descr_get() in Objects/descrobject.c
* It has two nearly zero cost predictable branches
* So the only real work is a call to
  PyObject_CallFunctionObjArgs(gs->prop_get, obj, NULL);
* which then calls both
       objargs_mktuple(vargs) 
  and 
       PyObject_Call(callable, args, NULL);
* That can be sped up by using
       PyTuple_New(1)
  and a direct call to  PyObject_Call()
* The logic in PyObject_Call can potentially be tightened
  in the context of a property(itemgetter(index)) call.
  Look to see whether the recursion check is necessary
  (itemgetter on a tuple subclass that doesn't extend __getitem__
   is non-recursive)
* If it is not needed, then the entire call to PyObject_Call() in
  property can potentially be simplified to:
   result = (*call)(func, arg, kw);
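
The chain in the notes above can be followed from the Python side.  This is
only a rough sketch: it mirrors the C steps with their Python-visible
equivalents and does not show the C internals themselves.  The Point class is
a hand-written stand-in, not the real namedtuple output.

```python
from operator import itemgetter


class Point(tuple):
    """Illustrative tuple subclass with property(itemgetter(i)) fields."""
    __slots__ = ()
    x = property(itemgetter(0))
    y = property(itemgetter(1))


p = Point((250, 43))

# The property object stored in the class dictionary.
prop = Point.__dict__['x']

via_attribute = p.x                      # ordinary attribute lookup
via_descriptor = prop.__get__(p, Point)  # the descriptor call the lookup makes
via_getter = prop.fget(p)                # the one-argument call to the itemgetter
via_index = p[0]                         # the indexing the itemgetter performs

print(via_attribute, via_descriptor, via_getter, via_index)
```

Each step returns the same object, which is why the call between the property
and the itemgetter is the natural place to look for savings.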

I haven't looked too closely at this, but I think you get the general idea.  If 
the speed of property(itemgetter(index)) is the bottleneck in your code, the 
starting point is to unwind the exact series of C steps performed to see if any 
of them can be simplified.  For the most part, property() and itemgetter()
were both implemented using the simplest parts of the C API rather than the
fastest.  The chain of calls isn't specialized for the common use case (i.e.
property_descr_get() needing exactly one argument rather than a
variable-length arg tuple, and itemgetter doing a known integer offset on a
list or tuple rather than the less common case of generic types and a possible
tuple of indices).
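
Before changing any C code, it may help to measure how much each layer
costs.  The following is a rough sketch using timeit; the Point class is a
hand-written stand-in, the iteration count is arbitrary, and the numbers will
vary by machine and Python version.

```python
import timeit
from operator import itemgetter


class Point(tuple):
    """Illustrative tuple subclass with one property(itemgetter(i)) field."""
    __slots__ = ()
    x = property(itemgetter(0))


p = Point((250, 43))
get_x = itemgetter(0)
names = {'p': p, 'get_x': get_x}

# Time the full chain, the itemgetter alone, and plain indexing.
timings = {
    'p.x (property + itemgetter)': timeit.timeit('p.x', globals=names, number=200_000),
    'get_x(p) (itemgetter only)': timeit.timeit('get_x(p)', globals=names, number=200_000),
    'p[0] (direct indexing)': timeit.timeit('p[0]', globals=names, number=200_000),
}

for label, seconds in timings.items():
    print(f'{label:30s} {seconds:.4f} s')
```

The gap between the first and last rows gives an upper bound on what
specializing the property and itemgetter call path could recover.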

We should start by optimizing what we've already got.  That will have a benefit 
beyond named tuples (faster itemgetters for sorting and faster property gets 
for the entire language).  

It also helps us avoid making the namedtuple code less familiar (using custom
private APIs rather than generic, well-known components).

It also reduces the risk of breaking code that relies on the published 
implementation of named tuple attribute lookups (for example, I've seen 
deployed code that customizes the attribute docstrings like this):
   from collections import namedtuple

   Point = namedtuple('Point', ['x', 'y'])
   # Reuse each field's getter, replacing only the docstring.
   Point.x = property(Point.x.fget, doc='abscissa')
   Point.y = property(Point.y.fget, doc='ordinate')
   coordinate = Point(x=250, y=43)
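
That kind of customization depends on each field being a property object
that exposes fget.  A small sketch of the dependency, using a hand-written
class whose fields are assumed to be plain property(itemgetter(i)) objects as
described in this message (the class is illustrative, not the generated
namedtuple code):

```python
from operator import itemgetter


class Point(tuple):
    """Stand-in whose fields are plain property(itemgetter(i)) objects."""
    __slots__ = ()

    def __new__(cls, x, y):
        return tuple.__new__(cls, (x, y))

    x = property(itemgetter(0))
    y = property(itemgetter(1))


# Same customization as above: reuse the getter, swap the docstring.
Point.x = property(Point.x.fget, doc='abscissa')
Point.y = property(Point.y.fget, doc='ordinate')

coordinate = Point(x=250, y=43)
print(Point.x.__doc__, coordinate.x)
print(Point.y.__doc__, coordinate.y)
```

If the fields were replaced by a custom C descriptor without an fget
attribute, the two reassignment lines would fail.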

----------
priority: normal -> low

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue23910>
_______________________________________