Tim Peters wrote: > RE [Richard Higginbotham higgi...@gmail.com] ...... > Traversing via a set or dict instead reads the objects' data in > pseudo-random memory order instead, which can give a very high rate of > cache misses. In addition, the internal hash table probes also occur > in pseudo-random order, adding yet another layer of cache misses. > Plain old lists are in fact pleasant in many ways :-)
Very informative, thank you. I tried different tables sizes and saw that you are correct that the table size doesn't matter. I assumed it was collisions that were adding the extra time when I increased the size of A and B. You are probably right that cache misses are a/the factor. In theory for algebraic set operations the size of the left hand argument (that's having its values "compared" to the other set) is the limiter. When I saw that time increase beyond the size increase of A, I thought it must be the hash functions but I didn't really want to say that not knowing the internals. Actually now I wish it was as it would easier to account for in judging my algorithm :-). _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/ANGHWMIWUJZWOWC37EPL3O4ICA5UZPK5/ Code of Conduct: http://python.org/psf/codeofconduct/