Re: [Python-Dev] proposed attribute lookup optimization

2007-07-11 Thread Paul Pogonyshev
[I don't know why I didn't receive this mail; presumably the spam filter
at gmx.net sucks as always]

Phillip J. Eby wrote:
> At 08:23 PM 7/8/2007 +0300, Paul Pogonyshev wrote:
> >I would like to propose an optimization (I think so, anyway) for the
> >way attributes are looked up.  [...]
> 
> [...]
>
> Again, though, this has already been proposed, and I believe there's 
> a patch awaiting review for inclusion in 2.6 (and presumably 3.0).

OK, good to know.  Of course it is better if done by someone familiar
with Python internals :)  After proposing this I decided it wasn't
worthwhile, since it would require cache revalidation after any
assignment to a new class attribute.  But apparently I just have an
incorrect picture of what is common in Python :)

Paul
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] proposed attribute lookup optimization

2007-07-08 Thread Phillip J. Eby
At 08:23 PM 7/8/2007 +0300, Paul Pogonyshev wrote:
>I would like to propose an optimization (I think so, anyway) for the
>way attributes are looked up.  Currently, it is done like this:
>
>     return value of attribute in instance.__dict__ if present
>     for type in instance.__class__.__mro__:
>         return value of attribute in type.__dict__ if present
>     raise AttributeError

Actually, it is only done like that for "classic" classes.  New-style 
classes actually work more like this:

     descriptor = None
     for t in type(ob).__mro__:
         if attribute in t.__dict__:
             descriptor = t.__dict__[attribute]
             if hasattr(descriptor, '__set__'):
                 return descriptor.__get__(ob, type(ob))
             break

     if attribute in ob.__dict__: return ob.__dict__[attribute]
     if descriptor is not None: return descriptor.__get__(ob, type(ob))
     if hasattr(type(ob),'__getattr__'): return ob.__getattr__(attribute)
     raise AttributeError
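
The order described above can be tried out with a small runnable sketch.
This is only an approximation written for Python 3, not the actual C
implementation: the function name `lookup` and the demo classes are made
up for illustration, the `__getattr__` fallback is left out, and values
without a `__get__` method are returned as-is.

```python
def lookup(ob, attribute):
    """Approximate new-style attribute lookup (illustration only)."""
    descriptor = None
    found = False
    for t in type(ob).__mro__:
        if attribute in t.__dict__:
            descriptor = t.__dict__[attribute]
            found = True
            # Data descriptors (those defining __set__) beat the instance dict.
            if hasattr(type(descriptor), '__set__'):
                return type(descriptor).__get__(descriptor, ob, type(ob))
            break

    # Next comes the instance dictionary.
    instance_dict = getattr(ob, '__dict__', {})
    if attribute in instance_dict:
        return instance_dict[attribute]

    # Finally, non-data descriptors and plain class attributes.
    if found:
        if hasattr(type(descriptor), '__get__'):
            return type(descriptor).__get__(descriptor, ob, type(ob))
        return descriptor
    raise AttributeError(attribute)


class Base(object):
    plain = 'class value'

    @property
    def prop(self):        # property defines __set__: a data descriptor
        return 'from property'

    def method(self):      # functions are non-data descriptors
        return 'from method'


class Derived(Base):
    pass


ob = Derived()
ob.__dict__['prop'] = 'instance value'    # shadowed by the data descriptor
ob.__dict__['plain'] = 'instance value'   # wins over the class attribute
```

With these definitions, `lookup(ob, 'prop')` goes through the property,
while `lookup(ob, 'plain')` is answered from the instance dictionary.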


>I propose adding to each type a C-implementation-private dictionary
>of attribute-name => type-in-which-defined.  Then, it will not be
>necessary to traverse __mro__ on each attribute lookup for names
>which are present in this lookup dictionary.

Sounds good to me...  but it's just as simple to store the 
descriptors directly, rather than the type that defines the 
descriptor.  Might as well cut out the middleman.
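
As a rough sketch of what "storing the descriptors directly" could look
like (the helper name `build_descriptor_cache` is hypothetical, and a
real implementation would live in C), the cache maps each name straight
to the value found first along the MRO, so no second dictionary lookup
on the defining type is needed:

```python
def build_descriptor_cache(cls):
    """Map each attribute name to the first value found along cls.__mro__."""
    cache = {}
    for t in cls.__mro__:
        for name, value in t.__dict__.items():
            # setdefault keeps the entry from the earliest type in the MRO.
            cache.setdefault(name, value)
    return cache


class A(object):
    x = 1

    def hello(self):
        return 'A.hello'


class B(A):
    def hello(self):
        return 'B.hello'


class C(B):
    pass


cache = build_descriptor_cache(C)
# cache['hello'] is the function defined on B, ready for __get__.
bound = cache['hello'].__get__(C(), C)
```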

I believe that someone proposed this already, with a patch, in fact...


>This optimization will not have any effect for attributes defined
>on instance.

It will for new-style classes, actually -- and a significant one if 
the inheritance hierarchy is deep and doesn't contain a default 
value for the attribute.
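
To illustrate why, here is a small sketch that counts how many type
dictionaries get probed before the instance dictionary is consulted.
The helper `count_type_dict_probes` is a made-up name and only mimics
the MRO scan from the pseudocode above; it does not measure the real
interpreter.

```python
def count_type_dict_probes(ob, attribute):
    """Count the type __dict__ checks made while scanning the MRO."""
    probes = 0
    for t in type(ob).__mro__:
        probes += 1
        if attribute in t.__dict__:
            break
    return probes


class A(object): pass
class B(A): pass
class C(B): pass
class D(C): pass
class E(D): pass


class WithDefault(E):
    x = 0          # class-level default stops the scan at once


t = E()
t.x = 1            # defined on the instance only

deep = count_type_dict_probes(t, 'x')                 # scans E, D, C, B, A, object
shallow = count_type_dict_probes(WithDefault(), 'x')  # found on the first type
```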


>   It will, however, for type attributes, most notably
>for methods.

Yep.  It'll also speed up access to inherited slots.


>   It will most likely cause a slowdown for looking up
>attributes that are defined directly on self.__class__, not on any
>of its bases.

Not if it's a direct cache of descriptors; in that case it will have 
no effect on lookup time.


>One open question is what to do in case an attribute on a type is
>set or deleted.

New-style classes can handle this easily; they know their subclasses 
and you can't directly write to a new-style class' __dict__.  So when 
you set or delete an attribute on a type, it's possible to walk the 
subclasses and update their caches accordingly.  I believe Python 
already does this so that if you e.g. set 'sometype.__call__ = 
something', then all the subclasses' C-level tp_call slots get 
changed to match.  The same approach could be used for caching on 
new-style classes.
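
A minimal sketch of that idea in pure Python 3, assuming a hypothetical
metaclass `Cached` and a module-level `lookup_caches` mapping (neither
exists in Python itself).  For simplicity it rebuilds the whole cache
of the class and of every subclass instead of updating single entries:

```python
import weakref

# Hypothetical side table: class -> {attribute name: value}.
lookup_caches = weakref.WeakKeyDictionary()


def _rebuild(cls):
    cache = {}
    for t in cls.__mro__:
        for name, value in t.__dict__.items():
            cache.setdefault(name, value)
    lookup_caches[cls] = cache


def _refresh_tree(cls):
    # Types know their subclasses, so walk them and refresh each cache.
    _rebuild(cls)
    for sub in cls.__subclasses__():
        _refresh_tree(sub)


class Cached(type):
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        _rebuild(cls)

    def __setattr__(cls, name, value):
        super().__setattr__(name, value)
        _refresh_tree(cls)

    def __delattr__(cls, name):
        super().__delattr__(name)
        _refresh_tree(cls)


class A(metaclass=Cached):
    x = 1


class B(A):
    pass


class C(B):
    pass


A.x = 2        # C's cache is refreshed through A -> B -> C
B.y = 'new'    # visible in B and C, but not in A
```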

Again, though, this has already been proposed, and I believe there's 
a patch awaiting review for inclusion in 2.6 (and presumably 3.0).



[Python-Dev] proposed attribute lookup optimization

2007-07-08 Thread Paul Pogonyshev
Hi,

I would like to propose an optimization (I think so, anyway) for the
way attributes are looked up.  Currently, it is done like this:

    return value of attribute in instance.__dict__ if present
    for type in instance.__class__.__mro__:
        return value of attribute in type.__dict__ if present
    raise AttributeError

I propose adding to each type a C-implementation-private dictionary
of attribute-name => type-in-which-defined.  Then, it will not be
necessary to traverse __mro__ on each attribute lookup for names
which are present in this lookup dictionary.

This optimization will not have any effect for attributes defined
on the instance.  It will, however, for type attributes, most notably
for methods.  It will most likely cause a slowdown for looking up
attributes that are defined directly on self.__class__, not on any
of its bases.  However, I believe it will be a benefit for all but
extremely shallow inheritance trees, especially if they involve
multiple inheritance.

One open question is what to do in case an attribute on a type is
set or deleted.

Python example:

class Current (type):

    @staticmethod
    def getattribute (self, name):
        dict = object.__getattribute__(self, '__dict__')
        if name in dict:
            return dict[name]

        mro = object.__getattribute__ (self, '__class__').__mro__
        for type in mro:
            dict = type.__dict__
            if name in dict:
                return dict[name]

        raise AttributeError

    def __init__(self, name, bases, dict):
        super (Current, self).__init__(name, bases, dict)
        self.__getattribute__ = Current.getattribute


class Optimized (type):

    @staticmethod
    def getattribute (self, name):
        dict = object.__getattribute__(self, '__dict__')
        if name in dict:
            return dict[name]

        #
        lookup = object.__getattribute__ (self, '__class__').__lookup_cache__
        if name in lookup:
            return lookup[name].__dict__[name]
        #

        mro = object.__getattribute__ (self, '__class__').__mro__
        for type in mro:
            dict = object.__getattribute__(type, '__dict__')
            if name in dict:
                return dict[name]

        raise AttributeError

    #
    def build_lookup_cache (self):
        lookup = {}
        for type in self.__mro__:
            for name in type.__dict__:
                if name not in lookup:
                    lookup[name] = type

        return lookup
    #

    def __init__(self, name, bases, dict):
        super (Optimized, self).__init__(name, bases, dict)
        #
        self.__lookup_cache__ = self.build_lookup_cache ()
        #
        self.__getattribute__ = Optimized.getattribute


class A (object):
    __metaclass__ = Optimized
    x = 1

class B (A):
    pass

class C (B):
    pass

class D (C):
    pass

class E (D):
    pass

t = E ()

for k in xrange (10):
    t.x

Try swapping metaclass of A from Optimized to Current and measure
execution time.

Paul