[Tim]
>> - I don't have a theory for why dict build time is _so_ much higher
>> than dict lookup time for the nasty keys.

To be clearer, in context this was meant to be _compared to_ the
situation for sets.  These were the numbers:

11184810 nasty keys
dict build           23.32
dict lookup          13.26
set build             9.79
set lookup            8.46

Build time is higher than lookup time for both, which isn't
surprising.  But it's _much_ higher (whether in absolute or relative
terms) for dicts than for sets.

[Serhiy Storchaka <storch...@gmail.com>]
> 1/2 of items were copied to a new hashtable when the dict grew up. 1/4
> of items were copied twice. 1/8 of items were copied three times, etc.
> In average every item is copied 1 time, so the build time is twice the
> lookup time when the cost of lookup is dominated.

I agree the average total number of table insertions per element
approaches 2 either way (counting insertions due to internal table
growth as well as insertions visible from Python code).  That's why
it's expected that build time exceeds lookup time (the latter does
exactly one lookup per element).


> But the lookup benchmarking includes the overhead of iterating Python
> loop, which is more significant for "good" keys, thus the lookup time is
> larger than the build time.

Iteration time is too minor to care much about.  Even in the worst
case (sets with the nasty keys), the Python:

    for k in d:
        pass

takes under a second here.  Take a full second off the "set lookup"
time, and dict build time still exceeds dict lookup time (whether in
absolute or relative terms) much more so than for sets.

But I'm not going to pursue it.  The high-order bit was just to
demonstrate that the set object's more complicated (than dicts) probe
sequence does buy highly significant speedups - in at least one highly
contrived case.  Since the probe sequence changes were aimed at
reducing cache misses, a full account requires comparing cache misses
between the dict and set implementations.  That's a mess, and slogging
through it appears unlikey to result in insight.
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/IYPHUKW7DGYNAFAC4AOCDYYUWANHN5YE/
Code of Conduct: http://python.org/psf/codeofconduct/

Reply via email to