Weston Pace created ARROW-16513:
-----------------------------------

    Summary: [C++] Add a compute function to hash inputs
        Key: ARROW-16513
        URL: https://issues.apache.org/jira/browse/ARROW-16513
    Project: Apache Arrow
 Issue Type: Bug
 Components: C++
   Reporter: Weston Pace

We have a lot of internal logic for hashing inputs, and it might be nice to expose some of it to users (e.g. https://stackoverflow.com/questions/72177022/how-to-get-hash-of-string-column-in-polars-or-pyarrow).

The `HashBatch` method in `key_hash.h` (not quite merged, but close) is likely to be the most performant option. However, it sacrifices some uniqueness of hashes for the sake of performance, so we should make sure to document those trade-offs.
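
For reference, here is a rough sketch of the behavior a user might expect from such a function: one hash per input value, with nulls passed through. This is plain Python using only the standard library, not Arrow's `HashBatch` and not an existing Arrow compute function. The name `hash_column` and the `bits` parameter are hypothetical, and BLAKE2b is just a stand-in hash chosen for illustration.

```python
import hashlib
from typing import List, Optional


def hash_column(values: List[Optional[str]], bits: int = 64) -> List[Optional[int]]:
    """Hypothetical helper: return one unsigned integer hash per value.

    Nulls (None) are passed through unchanged. A narrower hash is cheaper to
    store and compare but collides more often, which is the kind of
    trade-off the issue says should be documented.
    """
    if bits not in (32, 64):
        raise ValueError("bits must be 32 or 64")
    out: List[Optional[int]] = []
    for value in values:
        if value is None:
            out.append(None)
            continue
        # BLAKE2b supports variable digest sizes; take bits // 8 bytes.
        digest = hashlib.blake2b(value.encode("utf-8"), digest_size=bits // 8).digest()
        out.append(int.from_bytes(digest, "little"))
    return out


column = ["apple", "banana", None, "apple"]
hashes = hash_column(column)
narrow_hashes = hash_column(column, bits=32)
```

Equal inputs always map to equal hashes, but the reverse does not hold: two different inputs can share a hash, and this becomes more likely as the hash gets narrower. Callers using the result as a join or group key would still need to compare the original values.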