Mark Dickinson added the comment:
For what it's worth, here are timings on my machine showing the overhead of the
extra equality check when a hash collision occurs.
Python 2.7.11 (default, Mar 1 2016, 18:08:21)
Type "copyright", "credits" or "license" for more information.
IPython 4.2.0 -- An enhanced Interactive Python.
In [1]: from decimal import Decimal
In [2]: set1 = set([Decimal(str(n/1000.0)) for n in range(1, 10)] +
   ...:            [Decimal(str(n/100.0)) for n in range(1, 10)])
In [3]: set2 = set([Decimal(str(n/1000.0)) for n in range(2, 20)])
In [4]: print len(set1), len(set2) # Both sets have the same length
18 18
In [5]: print len(set(map(hash, set1))), len(set(map(hash, set2))) # But set1 has hash collisions
9 18
In [6]: %timeit Decimal('0.005') in set1 # checking elt in the set, first match is the right one
The slowest run took 5.98 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 17.4 µs per loop
In [7]: %timeit Decimal('0.05') in set1 # checking elt in the set, collision resolution needed
The slowest run took 5.72 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 19.6 µs per loop
In [8]: %timeit Decimal('0.005') in set2 # should be similar to the first set1 result
The slowest run took 5.99 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 17.3 µs per loop
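
The extra equality check can also be observed directly, independent of Decimal. The following is only a rough sketch under stated assumptions: Key is a hypothetical class (not part of decimal or of the session above) whose hash is forced by the caller, and the count of __eq__ calls reflects how CPython's set lookup probes colliding entries, which other implementations may do differently.

```python
import timeit


class Key(object):
    """Hypothetical key type whose hash value is chosen by the caller."""

    eq_calls = 0  # class-wide counter of __eq__ invocations

    def __init__(self, value, forced_hash):
        self.value = value
        self.forced_hash = forced_hash

    def __hash__(self):
        return self.forced_hash

    def __eq__(self, other):
        Key.eq_calls += 1
        return isinstance(other, Key) and self.value == other.value


def count_eq_calls(probe, container):
    """Return (found, number of __eq__ calls) for `probe in container`."""
    Key.eq_calls = 0
    found = probe in container
    return found, Key.eq_calls


# Two distinct keys that deliberately share one hash value (assumed: 1234).
colliding = set([Key("a", 1234), Key("b", 1234)])

# "a" was inserted first, so a probe equal to "a" should match on the
# first slot examined; a probe equal to "b" has to get past "a" first.
direct = count_eq_calls(Key("a", 1234), colliding)
collided = count_eq_calls(Key("b", 1234), colliding)
print(direct, collided)

# Rough timings; absolute numbers depend on the machine and interpreter.
t_direct = timeit.timeit(lambda: Key("a", 1234) in colliding, number=10000)
t_collided = timeit.timeit(lambda: Key("b", 1234) in colliding, number=10000)
print(t_direct, t_collided)
```

Both lookups succeed, but the second one needs more equality comparisons than the first, which corresponds to the small gap between the In [6] and In [7] timings above.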
----------
status: pending -> open
_______________________________________
Python tracker
<http://bugs.python.org/issue27265>
_______________________________________