New submission from INADA Naoki: Current perturb shift code is like following:
for (perturb = hash; ; perturb >>= PERTURB_SHIFT) { i = mask & ((i << 2) + i + perturb + 1); This loop is start after first conflict. It means perturb == hash for first conflict. The purpose of perturb shift is avoid long conflict chain when keys more than two have hashes their lower-bit is same. So first perturb should be hash >> PERTURB_SHIFT. example: Consider about ma_keys == 16 and keys are [1, 17, 33, 49, 65]. Current perturb 1. hash(1) & (16-1) == 1; 1 uses ix==1 slot. 2. hash(17) & (16-1) == 1; ix==1 conflicts; Next ix is mask & (3 + 17 + 1) == 5; use ix==5 slot. 3. hash(33) & (16-1) == 1; ix==1 conflicts; Next ix is mask & (3 + 33 + 1) == 5; ix==5 conflicts; ... When first perturb = hash >> PERTURB_SHIFT: 1. hash(1) & (16-1) == 1; 1 uses ix==1 slot. 2. hash(17) & (16-1) == 1; ix==1 conflicts; Next ix is mask & (3 + (17>>5) + 1) == 4; use ix==4 slot. 3. hash(33) & (16-1) == 1; ix==1 conflicts; Next ix is mask & (3 + (33>>5) + 1) == 5; use ix==5 slot. While it's difficult to see performance difference from benchmark, this should be decrease possibility of 2nd conflict. ---------- components: Interpreter Core messages: 276936 nosy: methane priority: normal severity: normal status: open title: dict: perturb shift should be done when first conflict versions: Python 3.6 _______________________________________ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue28201> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com