caiwanli opened a new issue, #44598: URL: https://github.com/apache/arrow/issues/44598
### Describe the usage question you have. Please include as many useful details as possible.

My requirement is: suppose my RecordBatch contains three columns, and I would like to calculate a hash value based on two of those columns. The process is illustrated in the following diagram:

The function interface I designed is as follows:

`vector<size_t> HashRB1(std::shared_ptr<arrow::RecordBatch> &input, vector<int> idxs, int type);`

Here ‘input’ is the input data, ‘idxs’ specifies the columns for which the hash needs to be calculated, and ‘type’ indicates the type of hash function.

Alternatively, it can be done as shown below, where the hash value is added as a new column to the existing RecordBatch.

Function interface:

`shared_ptr<arrow::RecordBatch> HashRB2(std::shared_ptr<arrow::RecordBatch> &input, vector<int> idxs, int type);`

In summary, how should I implement the HashRB1 or HashRB2 function? I checked the Arrow documentation, and it seems there isn't a direct function to compute a hash for a RecordBatch.

### Component(s)

C++

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
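To make the intended behaviour of `HashRB1` concrete, here is a minimal, library-independent sketch of the combining step: one hash per row, built only from the selected columns. It deliberately does not use Arrow. `Column`, `Batch`, `HashCombine` and `HashRows` are hypothetical names introduced for illustration, the columns are plain `std::vector`s standing in for Arrow arrays, and the ‘type’ parameter is left out, so `std::hash` with a Boost-style combiner is an assumed stand-in for whatever hash function is actually wanted. It requires C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

// Hypothetical stand-in for an Arrow column: a typed vector of values.
using Column = std::variant<std::vector<std::int64_t>, std::vector<std::string>>;
// Hypothetical stand-in for a RecordBatch: a list of equal-length columns.
using Batch = std::vector<Column>;

// Boost-style combiner (an assumed choice): folds one value hash into a seed.
inline std::size_t HashCombine(std::size_t seed, std::size_t value) {
  return seed ^ (value + 0x9e3779b97f4a7c15ULL + (seed << 6) + (seed >> 2));
}

// Returns one hash per row, computed only from the columns listed in idxs.
// Columns are visited one at a time so each inner loop runs over one type.
std::vector<std::size_t> HashRows(const Batch& batch,
                                  const std::vector<int>& idxs,
                                  std::size_t num_rows) {
  std::vector<std::size_t> hashes(num_rows, 0);
  for (int idx : idxs) {
    // at() throws std::out_of_range for an invalid column index.
    const Column& column = batch.at(static_cast<std::size_t>(idx));
    std::visit(
        [&](const auto& values) {
          using T = typename std::decay_t<decltype(values)>::value_type;
          assert(values.size() == num_rows);  // all columns share one length
          for (std::size_t row = 0; row < num_rows; ++row) {
            hashes[row] = HashCombine(hashes[row], std::hash<T>{}(values[row]));
          }
        },
        column);
  }
  return hashes;
}
```

With a three-column batch and `idxs = {0, 1}`, two rows that agree in columns 0 and 1 receive the same hash regardless of column 2. The order of `idxs` matters with this combiner, and null handling is not covered. For the `HashRB2` variant, the resulting vector would presumably be copied into an unsigned 64-bit Arrow array and appended to the batch as a new column.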
