Dandandan edited a comment on issue #26: URL: https://github.com/apache/arrow-datafusion/issues/26#issuecomment-824283722
Thanks @jorgecarleitao! Yeah, I agree: it would be great to have a hashing kernel, or maybe some basic primitives to build one easily using arrow. Something like `hash_hasher` looks cool too, and is actually very similar to what the hash join uses (a hashbrown hashmap + `IdHashBuilder`, which just uses the identity function). Looking at the code of `hash_hasher`, it does something similar to the hash join's `IdHashBuilder`, but it seems to do a bit more work (e.g. the "hash combiner" works over bytes). `hashbrown` is also slightly faster than the standard library one; I believe that is because of more inlining, since the standard library one uses / exports the same crate. For this PR I was thinking of moving the code to `hash_utils`.
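For readers who have not seen the identity-hasher pattern mentioned above, here is a minimal sketch of the idea. It is not the DataFusion code: the names `IdHasher`, `IdBuildHasher` and `build_index` are made up for illustration, and it uses the standard library `HashMap` rather than `hashbrown` so that it compiles without extra crates. It assumes the keys are `u64` hashes that were already computed elsewhere, so hashing them a second time would be wasted work.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasher, Hasher};

/// Hypothetical identity hasher: the "hash" of a `u64` key is the key itself.
#[derive(Default)]
struct IdHasher {
    hash: u64,
}

impl Hasher for IdHasher {
    fn finish(&self) -> u64 {
        self.hash
    }

    // `u64`'s `Hash` impl calls `write_u64`, so this is the only path we need.
    fn write_u64(&mut self, i: u64) {
        self.hash = i;
    }

    // Any other key type would end up here; this sketch only supports `u64`.
    fn write(&mut self, _bytes: &[u8]) {
        unreachable!("IdHasher only supports u64 keys");
    }
}

/// Hypothetical builder that hands out `IdHasher` instances.
#[derive(Default, Clone, Copy)]
struct IdBuildHasher;

impl BuildHasher for IdBuildHasher {
    type Hasher = IdHasher;

    fn build_hasher(&self) -> IdHasher {
        IdHasher::default()
    }
}

/// Groups row indices by their precomputed hash value without re-hashing.
fn build_index(hashes: &[u64]) -> HashMap<u64, Vec<usize>, IdBuildHasher> {
    let mut map: HashMap<u64, Vec<usize>, IdBuildHasher> =
        HashMap::with_hasher(IdBuildHasher);
    for (row, &h) in hashes.iter().enumerate() {
        map.entry(h).or_default().push(row);
    }
    map
}
```

Calling `build_index(&[7, 42, 7])` puts rows 0 and 2 under key 7 and row 1 under key 42. Rows that land under the same key may still come from different values (hash collisions), so a join built on such an index would still need to compare the actual column values.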
