David Rorke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15511 )

Change subject: WIP IMPALA-9434: Implement Robin Hood Hash Table.
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15511/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15511/3//COMMIT_MSG@14
PS3, Line 14: Instead of proactively swapping elements during insertion, the
> Having separate Probe() function that is special for insert might remove th
Yes, my intuition is also that doing the additional work to create a separate 
single pass probe for insert and also making sure we maintain the distance 
invariant in all cases are both worthwhile.


http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h
File be/src/exec/hash-table.inline.h:

http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@162
PS3, Line 162: inline int64_t HashTable::BucketDistance(
> I'll see if I can put O(1) distance measurement, at least for the linear pr
Wonder if we should consider storing the bucket distance (calculated during 
insert/rebalance) in the bucket itself.  This might work against the goals of 
IMPALA-7635 but with a carefully packed memory layout maybe we could still 
squeeze in a byte sized value for the distance.  Seems intuitively like a byte 
should be enough to store most distances, and could fall back to calculating 
the distance via probe in rare cases where the distance won't fit in a byte.

Of course if you can figure out a good enough O(1) approach that doesn't 
require any storage that might be simpler.

Regarding linear vs quadratic I'd also hope that robin-hood doesn't require 
quadratic.  Maybe we should just do some benchmarking and see if there's a 
clear winner between different probing algorithms and if so go with that.



--
To view, visit http://gerrit.cloudera.org:8080/15511
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I28eeccd7f9ccae39e31972391f971901bcbfe986
Gerrit-Change-Number: 15511
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: David Rorke <dro...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Wed, 25 Mar 2020 01:36:37 +0000
Gerrit-HasComments: Yes

Reply via email to