[GitHub] [arrow-datafusion] Dandandan edited a comment on issue #26: Vectorized hashing for hash aggregation code

GitBox Wed, 21 Apr 2021 11:56:25 -0700


Dandandan edited a comment on issue #26:
URL: https://github.com/apache/arrow-datafusion/issues/26#issuecomment-824283722



   Thanks @jorgecarleitao !
   
   Yeah, I agree: it would be great to have a hashing kernel, or maybe some 
basic primitives to build one easily, in arrow. Something like `hash_hasher` 
looks cool too, and is actually very similar to what the hash join uses 
(a hashbrown hashmap + `IdHashBuilder`, which just uses the identity function). 
Looking at the code of `hash_hasher`, it takes a similar approach to the hash 
join's `IdHashBuilder`, but it seems to do a bit more work (e.g. the "hash 
combiner" works over bytes). `hashbrown` is also slightly faster than the 
standard library hashmap. I believe that is because of more inlining, since 
the standard library one is built on the same crate.
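
   To illustrate the identity-hasher idea described above, here is a minimal 
sketch. It is not the actual DataFusion code: the `IdHasher` struct and the 
`group_rows_by_hash` helper are hypothetical names, the `IdHashBuilder` body is 
my own guess at the shape, and it uses the standard library `HashMap` to stay 
dependency-free (I'd expect `hashbrown::HashMap` to accept the same builder 
through its own `with_hasher`). It assumes the `u64` keys are hashes that were 
already computed per row, so the map does not need to hash them again.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasher, Hasher};

/// Hasher that passes a `u64` key through unchanged (hypothetical sketch).
#[derive(Default)]
struct IdHasher {
    hash: u64,
}

impl Hasher for IdHasher {
    fn finish(&self) -> u64 {
        self.hash
    }

    // `u64`'s `Hash` impl calls `write_u64`, so this is the only path we need.
    fn write_u64(&mut self, i: u64) {
        self.hash = i;
    }

    fn write(&mut self, _bytes: &[u8]) {
        unreachable!("IdHasher only supports u64 keys");
    }
}

/// Builder that hands out identity hashers (sketch, not the DataFusion type).
#[derive(Clone, Default)]
struct IdHashBuilder;

impl BuildHasher for IdHashBuilder {
    type Hasher = IdHasher;

    fn build_hasher(&self) -> IdHasher {
        IdHasher::default()
    }
}

/// Groups row indices by their precomputed hash value.
/// Rows with colliding hashes land in the same bucket, so a real
/// implementation would still have to compare the actual key values.
fn group_rows_by_hash(hashes: &[u64]) -> HashMap<u64, Vec<usize>, IdHashBuilder> {
    let mut map: HashMap<u64, Vec<usize>, IdHashBuilder> =
        HashMap::with_hasher(IdHashBuilder);
    for (row, &h) in hashes.iter().enumerate() {
        map.entry(h).or_default().push(row);
    }
    map
}

fn main() {
    // Pretend these were produced by a vectorized hashing pass over a column.
    let hashes = [42u64, 7, 42, 7, 99];
    let groups = group_rows_by_hash(&hashes);
    for (hash, rows) in &groups {
        println!("hash {} -> rows {:?}", hash, rows);
    }
}
```

   The point of the design is that the expensive hashing happens once, up 
front, over whole columns, and the map lookup itself becomes a cheap 
pass-through.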
   
   For this PR I was thinking of moving the code to `hash_utils`.


