David Rorke has posted comments on this change. ( http://gerrit.cloudera.org:8080/15511 )
Change subject: WIP IMPALA-9434: Implement Robin Hood Hash Table. ...................................................................... Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/15511/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15511/3//COMMIT_MSG@14 PS3, Line 14: Instead of proactively swapping elements during insertion, the > Having separate Probe() function that is special for insert might remove th Yes, my intuition is also that doing the additional work to create a separate single pass probe for insert and also making sure we maintain the distance invariant in all cases are both worthwhile. http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h File be/src/exec/hash-table.inline.h: http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@162 PS3, Line 162: inline int64_t HashTable::BucketDistance( > I'll see if I can put O(1) distance measurement, at least for the linear pr Wonder if we should consider storing the bucket distance (calculated during insert/rebalance) in the bucket itself. This might work against the goals of IMPALA-7635 but with a carefully packed memory layout maybe we could still squeeze in a byte sized value for the distance. Seems intuitively like a byte should be enough to store most distances, and could fall back to calculating the distance via probe in rare cases where the distance won't fit in a byte. Of course if you can figure out a good enough O(1) approach that doesn't require any storage that might be simpler. Regarding linear vs quadratic I'd also hope that robin-hood doesn't require quadratic. Maybe we should just do some benchmarking and see if there's a clear winner between different probing algorithms and if so go with that. -- To view, visit http://gerrit.cloudera.org:8080/15511 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I28eeccd7f9ccae39e31972391f971901bcbfe986 Gerrit-Change-Number: 15511 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Reviewer: David Rorke <dro...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Wed, 25 Mar 2020 01:36:37 +0000 Gerrit-HasComments: Yes