Weston Pace created ARROW-16513:
-----------------------------------

             Summary: [C++] Add a compute function to hash inputs
                 Key: ARROW-16513
                 URL: https://issues.apache.org/jira/browse/ARROW-16513
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++
            Reporter: Weston Pace


We have a lot of internal logic for hashing inputs, and it might be nice to
expose some of this to users (e.g.
https://stackoverflow.com/questions/72177022/how-to-get-hash-of-string-column-in-polars-or-pyarrow).

The `HashBatch` method in `key_hash.h` (not quite merged but close) is likely
to be the most performant option.  However, it trades some uniqueness of the
hashes for speed, so we should make sure to document these trade-offs.
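
To make the intended shape concrete, here is a minimal, self-contained sketch
of an element-wise hash: one 64-bit value per input row.  It does not use
Arrow or `HashBatch`; `HashStrings` is a hypothetical name, and `std::hash`
stands in for whatever hash the real kernel would use.  It only illustrates
the contract users could rely on: equal inputs always produce equal hashes,
while distinct inputs may still collide.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper (not an Arrow API): returns one hash per input row.
// std::hash is a stand-in; its values are implementation-defined and are
// not guaranteed to be unique across distinct inputs.
std::vector<uint64_t> HashStrings(const std::vector<std::string>& column) {
  std::vector<uint64_t> hashes;
  hashes.reserve(column.size());
  std::hash<std::string> hasher;
  for (const auto& value : column) {
    hashes.push_back(static_cast<uint64_t>(hasher(value)));
  }
  return hashes;
}
```

Calling `HashStrings` on a column such as {"a", "b", "a"} yields three values
in which the first and third are equal.  Because collisions are possible,
callers should not treat the result as a unique identifier, which is the kind
of caveat the documentation would need to spell out.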



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
