Riza Suminto has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15511


Change subject: WIP IMPALA-9434: Implement Robin Hood Hash Table.
......................................................................

WIP IMPALA-9434: Implement Robin Hood Hash Table.

Robin hood hashing reduces the variances of probe lengths by
continually rebalancing elements. This commit implement robin hood
hashing behavior into exec/hash-table.h with small modification to the
robin hood algorithm to keep probe function of hash-table intact.

Instead of proactively swapping elements during insertion, the
algorithm does the rebalancing by first letting the new element probe
an empty bucket and claim it through PrepareBucketForInsert(). After
the bucket at index curr_idx claimed, the rebalancing algorithm in
function Rebalance() walk through the probe sequence again to find
richer or empty bucket to swap with bucket[curr_idx]. The search and
swapping continue until it swap with empty bucket or does not find any
richer bucket along the probe sequence. It effectively treat
bucket[curr_idx] as temporary variable during the rebalancing process.
Rebalance() return the new index of the recently claimed bucket after
rebalancing process done.

The decision to not swap elements proactively is because the probe
function is used for both insertion and lookup-only. Keeping probing
routine decoupled from insertion routine allow us to invoke rebalance
only after bucket claim, not on lookup-only. Insert into an already
filed bucket (duplicates), or insert using hash-table iterator does
not trigger rebalance.

Testing:
- Pass core tests

Change-Id: I28eeccd7f9ccae39e31972391f971901bcbfe986
---
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
2 files changed, 94 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/15511/3
--
To view, visit http://gerrit.cloudera.org:8080/15511
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I28eeccd7f9ccae39e31972391f971901bcbfe986
Gerrit-Change-Number: 15511
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: David Rorke <dro...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>

Reply via email to