On Sep 19, 2019, at 20:25, Tim Peters <tim.pet...@gmail.com> wrote:
> 
> Something I haven't seen mentioned here, but I may have missed it:
> when timing algorithms with sets and dicts, people often end up merely
> measuring the bad effects of hardware data cache misses when the
> containers get large, and especially in contrived benchmarks.

Another issue: the container itself is larger.

A list’s array uses one pointer (8 bytes on a 64-bit build) per element. Even without looking at the implementation of set, a hash bucket clearly has to be at least twice that size, since it stores a cached hash alongside the pointer. Plus, lists need less slack, and they keep their slack all at the end, where it doesn’t get in the way of traversal, while hash tables have their slack distributed throughout the table, where it does. So a list should be able to grow to at least 2.2x as many elements as a set before hitting the threshold where it no longer fits entirely in cache.
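On CPython you can see the top-level difference directly with sys.getsizeof, which measures only the container itself, not the objects it references — exactly the "top level" at issue here. (Exact figures vary by CPython version and platform; this is a sketch, not a spec.)

```python
# Compare the top-level memory footprint of a list vs. a set
# holding the same elements. sys.getsizeof reports only the
# container's own allocation, not the referenced int objects.
import sys

n = 100_000
list_size = sys.getsizeof(list(range(n)))
set_size = sys.getsizeof(set(range(n)))

print(f"list: {list_size:>10,} bytes ({list_size / n:.1f} per element)")
print(f"set:  {set_size:>10,} bytes ({set_size / n:.1f} per element)")
print(f"ratio: {set_size / list_size:.1f}x")
```

On a typical 64-bit CPython the list lands near 8 bytes per element while the set is several times that, since the hash table holds 16-byte buckets and keeps itself well under full.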

But of course this is only true if you can forget about the objects themselves. Each string object is a lot more than 16 bytes, and the character data it carries usually is too, so in most tests the top-level difference is swamped.
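To put rough numbers on that: even an empty CPython str is several times bigger than a list slot or a hash bucket, before any character data at all (the exact overhead shifts a little between versions).

```python
# Per-string overhead dwarfs the 8-vs-16-byte container difference.
# On recent CPython an ASCII str carries roughly 49 bytes of object
# header before its character data; exact numbers vary by version.
import sys

for s in ["", "a", "hello", "a" * 100]:
    print(f"{s!r:>12}: {sys.getsizeof(s)} bytes for {len(s)} characters")
```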

Still, the nonlinear jumps when you cross boundaries like “everything fits in cache” or “at least the top level fits in cache” (and again at each cache level) can show up as visible distortions — or as visible reflections of a real benefit, if your application’s access pattern is close enough to the artificial benchmark. You can test list behavior here with a huge list that repeats the numbers 1 to 100, so only 100 distinct objects are ever referenced; but you can’t do the same with a set, because deduplication would collapse it to a tiny set of 100 elements.
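Concretely, the trick that works for list but is impossible for set looks like:

```python
# A list can reference the same 100 small int objects a million
# times, so the referenced objects stay cache-resident while the
# top-level array grows arbitrarily large. A set deduplicates,
# so it can never be made big this way.
data = list(range(1, 101)) * 10_000   # 1,000,000 slots, 100 distinct objects

print(len(data))        # 1000000
print(len(set(data)))   # 100 -- the set collapses to the distinct values
```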

And of course even once you know the exact characteristics of the cache behavior with your real data, that can be hard to optimize for beyond “try to keep adjacent things adjacent”. There’s a paper from the GHC folks from many years ago about how they cleverly tuned the ropes (linked lists of mid-sized arrays) they used for large-string internals around the L2 cache — and found that the code that worked great on their older Intel CPU was a huge pessimization on the next generation of CPUs, because it fooled the hardware prefetcher into thinking they were never going to read past the end of the current page, so it didn’t prefetch the next one.
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/A7ARNTLGQFVAQAKPUDYB2NSCCYJ5665G/
Code of Conduct: http://python.org/psf/codeofconduct/
