Greg Ewing wrote:
> Neal Norwitz wrote:
>> Part of this is done, but very differently in that all strings used in
>> code objects are interned (stored in a dictionary
> 
> And since two interned strings can be compared
> by pointer identity, I don't see how this differs
> significantly from the "unique integer" idea.
> 
> If the integers were used to directly index an
> array instead of being used as dict keys, it
> might make a difference. The cost would be that
> every namespace would need to be as big as
> the number of names in existence, with most
> of them being extremely sparse.

And we already have a data structure for sparse arrays... it's called a 
"dict". :)

If every attribute name were guaranteed to be an interned string (not 
currently the case - attribute names can be any kind of object), it 
might be possible to add another dict specialization for interned string 
lookup. The wins would be that lookup could assume the string's hash is 
valid, and equality comparison could be done via pointer comparison. 
HOWEVER...

I think that the attribute cache patches that went into 2.6 and 3.0 
mostly take care of lookup speed issues. They both assume strings are 
interned already. A cache hit consists of calculating a cache index from 
the string hash, ensuring that the attribute name at that index is 
identical (via pointer comparison) to the attribute name to be looked 
up, and returning the associated value. Lookup with an attribute that's 
not a string or not interned is automatically a cache miss, and it 
happens very rarely.

Specializing dicts for interned strings would optimize the cache miss. 
(When I was making the 3.0 patch, I found that happened rarely on the 
benchmarks and regression tests. It was somewhere around 5%.) The cache 
miss isn't expensive just because of the dict lookup. The attribute has 
to be searched for in every type and super-type of the object. The 
occasional string equality test probably doesn't register.

I'd be happy to be shown to be wrong, though.

Neil
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Reply via email to