Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/15511 )
Change subject: WIP IMPALA-9434: Implement Robin Hood Hash Table. ...................................................................... Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h File be/src/exec/hash-table.inline.h: http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@138 PS3, Line 138: if (curr_bucket->filled) { > Good point, I must have miss this while I refine my code. Will change this I just remember that it is possible to swap with an empty bucket, especially in quadratic_probing mode. Consider hash-table with buckets size 4, quadratic probing, and incoming elements 1, 1, 2 in that order. Let say -1 represent empty bucket and hash of an element is equal to that element itself. Initially the buckets look like this: [-1, -1, -1, -1] After 1 and 1 inserted to the table, buckets will look like this [-1, 1, 1, -1] Now, when we want to insert 2 to table, this algorithm will put 2 at the last empty bucket in the probe sequence, that is index 3. [-1, 1, 1, 2] Then, we rebalance by swapping 2 at index 3 with 1 at index 2 [-1, 1, 2, 1] Now, the element 1 at index 3 is misplaced, because if we follow quadratic_probing, the appropriate probe sequence for 1 should be 1, 2, 0 (that is (2 + 2) mod 4), 3, and so on. So we need to swap 1 at index 3 with the empty bucket at index 0. [1, 1, 2, -1] At this point, bucket index 3 is an empty bucket and the rebalancing finish. This is kind of counter intuitive, because most literature show that robin-hood should pair with linear probing. And in linear probing, this case will not happen. -- To view, visit http://gerrit.cloudera.org:8080/15511 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I28eeccd7f9ccae39e31972391f971901bcbfe986 Gerrit-Change-Number: 15511 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Reviewer: David Rorke <dro...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Wed, 25 Mar 2020 01:28:31 +0000 Gerrit-HasComments: Yes