[issue30671] dict: simplify and improve lookup function

2017-06-15 Thread INADA Naoki
INADA Naoki added the comment: It seems not "simplify". Please try implement it and show benchmark result if you think it's really worth enough. -- ___ Python tracker ___ __

[issue30671] dict: simplify and improve lookup function

2017-06-15 Thread Dmitry Rubanovich
Dmitry Rubanovich added the comment: Tim, I am not testing randomly. The scripts calculate secondary hash for **each** value in a ring to see how often this results in duplicate (or triplicate, etc.) values. And they do it for each (mod 2**i) ring for i between 8 and 20. Then (mostly as an

[issue30671] dict: simplify and improve lookup function

2017-06-15 Thread Tim Peters
Tim Peters added the comment: Whatever the iteration, accept that it's necessary it reach every value in range(2**i) eventually. The current scheme does that, for reasons already documented. You seem to be overlooking the importance of this part: """ Note that because perturb is unsigned, if

[issue30671] dict: simplify and improve lookup function

2017-06-15 Thread Dmitry Rubanovich
Dmitry Rubanovich added the comment: Yes, I do understand it. But the statement in lines 166, 167: "For any initial j in range(2**i), repeating that 2**i times generates each int in range(2**i) exactly once" does not hold when the perturb is added. 2**i times will not be enough to generate

[issue30671] dict: simplify and improve lookup function

2017-06-15 Thread INADA Naoki
INADA Naoki added the comment: Firstly, do you understand the lookup without perturb? It's documented here: https://github.com/python/cpython/blob/258bfc462b1e58689b43f662a10e44ece3a10bef/Objects/dictobject.c#L161-L182 >>> n = 0; L = [] >>> for i in range(16): ... L.append(n) ... n = (n*5+1)

[issue30671] dict: simplify and improve lookup function

2017-06-15 Thread Dmitry Rubanovich
Dmitry Rubanovich added the comment: I am looking at the 3.6.1 code. I am not trying to simulate the index lookup as the lookup function would do it. There is no point in that. The point of the script is to detect collisions among perturbed secondary indexes rather than between primary has

[issue30671] dict: simplify and improve lookup function

2017-06-15 Thread INADA Naoki
INADA Naoki added the comment: You may misunderstood current lookup loop. In your collisions_count.py, `yield i` must be added right before `while True:` -- ___ Python tracker _

[issue30671] dict: simplify and improve lookup function

2017-06-15 Thread Dmitry Rubanovich
New submission from Dmitry Rubanovich: lookdict_index() (and the rest of the files in dictobject.c) are using unnecessarily complicated perturb mechanism. And, in fact, it's slower than the simpler case. Instead of this: for (size_t perturb = hash;;) { perturb >>= PERTURB_SHIFT; i =