Brandt Bucher <brandtbuc...@gmail.com> added the comment:

> If I understand your proposal, that would mean that calling a function with a 
> N-case constant-pattern match would require N hashes and N insertions into a 
> dict (including memory allocation), followed by O(1) lookup. Wouldn't that 
> eliminate any hope of making such a function O(1)? Unless there's a way to 
> cache the populated dict somewhere else?

Yeah, that was the idea sketched out above. Basically, if we go with a vanilla 
dictionary here, we're going to need to build it at runtime. It only really 
makes sense to do that once, the first time we need it. Then we stash it 
away... uh... somewhere. As you note, it doesn't make much sense to spend 
linear time building a constant-time jump table if it's only going to be used 
once. :)

Maybe somebody familiar with the constantly-evolving opcaches could chime in if 
this is the sort of thing that could benefit from that infrastructure. 
Basically, we would need a separate cache per bytecode offset, per code object. 
My understanding is that we don't have anything like that yet.

(A quick-and-dirty prototype could probably store them at "hidden" local names 
like ".table0", ".table1", ".table2", etc. I know comprehensions do something 
like that.)
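
To make the idea concrete, here's a rough pure-Python sketch of what the 
compiled code for an N-case constant-pattern match might do with a cached 
jump table. The names (`_CACHE`, `dispatch`) are made up for illustration; 
in reality the table would live in an opcache slot or a hidden local, not 
a module-level dict:

```python
# Hypothetical sketch: cache the jump table so the O(N) build cost is
# paid once, and every later call dispatches in O(1).
_CACHE = {}  # stand-in for a per-code-object, per-offset cache slot

def dispatch(x, cache_key=0):
    table = _CACHE.get(cache_key)
    if table is None:
        # First call: build the constant-pattern table (O(N)).
        table = _CACHE[cache_key] = {"red": 1, "green": 2, "blue": 3}
    # Every call after that is a single dict lookup (O(1)).
    return table.get(x, -1)
```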

> Re-implementing all of the dict logic seems like an unnecessary pain, which 
> is why I was suggesting either the HAMT or a thin wrapper around dict, not a 
> re-implementation of a new hash table.

Well, I don't think we'd need to do any of that. I believe set and frozenset 
share the same core design and routines, but differ only in the interfaces 
provided by the objects themselves. I imagine we could do something similar 
with a hypothetical _PyFrozenDict_Type... copy most of the type definition, 
dropping all of the methods except mp_subscript, tp_hash, and a few other 
members. That would probably be all we needed to get this design up and running 
for a proof-of-concept.
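
As a rough Python-level analogue of that hypothetical _PyFrozenDict_Type 
(not the C implementation, just the shape of the interface): delegate 
storage and lookup to a real dict, and expose only subscription and 
hashing, the way frozenset keeps set's internals but adds tp_hash and 
drops mutation:

```python
# Illustrative only: a read-only, hashable mapping that reuses dict's
# storage and lookup machinery rather than reimplementing a hash table.
class FrozenDict:
    __slots__ = ("_d",)

    def __init__(self, mapping):
        self._d = dict(mapping)   # real dict does the heavy lifting

    def __getitem__(self, key):   # the mp_subscript equivalent
        return self._d[key]

    def __hash__(self):           # the tp_hash equivalent
        return hash(frozenset(self._d.items()))
```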

A lot of work goes into making dicts fast (especially for things like strings), 
so it would be nice for a new type to be able to benefit from those incremental 
improvements.

> I was hoping to do the minimum amount of disruption possible to get 
> reasonable O(1) performance, and then seeing where that leaves us.

Agreed, but the HAMT mapping has logarithmic time complexity for lookups, 
right? Maybe for realistic cases the coefficients make it basically equivalent, 
but on the surface it seems more promising to try reusing the dict guts instead.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue44283>
_______________________________________