Wes McKinney created ARROW-3978:
-----------------------------------

    Summary: [C++] Implement hashing, dictionary-encoding for StructArray
        Key: ARROW-3978
        URL: https://issues.apache.org/jira/browse/ARROW-3978
    Project: Apache Arrow
 Issue Type: New Feature
 Components: C++
   Reporter: Wes McKinney
    Fix For: 0.13.0

This is a central requirement for hash aggregations such as

{code}
SELECT AGG_FUNCTION(expr)
FROM table
GROUP BY expr1, expr2, ...
{code}

The materialized keys in the GROUP BY clause form a struct, which can be incrementally hashed to produce dictionary codes suitable for computing aggregates or for any other purpose.

There are a few subtasks related to this, such as efficiently constructing a record (one that can be hashed quickly) to identify each "row" in the struct. Maybe we should start with that.
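
To make the idea of a hashable per-row record concrete, here is a minimal sketch using only the C++ standard library. It is not Arrow code and does not use any Arrow API: `KeyColumns`, `PackRow`, and `StructKeyEncoder` are hypothetical names, the key is assumed to have exactly two fields (an int64 and a string) standing in for `expr1` and `expr2`, and nulls are ignored. Each row is packed into one contiguous byte record, and a hash table that persists across batches assigns dictionary codes in first-seen order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the child arrays of a two-field struct key.
struct KeyColumns {
  std::vector<int64_t> expr1;
  std::vector<std::string> expr2;
};

// Pack one row into a contiguous record: the 8 bytes of expr1, then a
// 4-byte length prefix, then the bytes of expr2. The length prefix keeps
// records unambiguous if more variable-width fields were appended.
inline std::string PackRow(const KeyColumns& cols, size_t row) {
  const int64_t a = cols.expr1[row];
  const std::string& b = cols.expr2[row];
  const uint32_t len = static_cast<uint32_t>(b.size());
  std::string record(sizeof(a) + sizeof(len) + b.size(), '\0');
  std::memcpy(&record[0], &a, sizeof(a));
  std::memcpy(&record[sizeof(a)], &len, sizeof(len));
  std::memcpy(&record[sizeof(a) + sizeof(len)], b.data(), b.size());
  return record;
}

// Incremental encoder: the hash table outlives a single batch, so a key
// seen in an earlier batch keeps the same dictionary code later on.
class StructKeyEncoder {
 public:
  // Returns one dictionary code per row of the batch.
  std::vector<int32_t> Encode(const KeyColumns& cols) {
    assert(cols.expr1.size() == cols.expr2.size());
    std::vector<int32_t> codes;
    codes.reserve(cols.expr1.size());
    for (size_t i = 0; i < cols.expr1.size(); ++i) {
      std::string record = PackRow(cols, i);
      auto it = table_.find(record);
      if (it == table_.end()) {
        // First occurrence: the next free code is the dictionary size.
        const int32_t code = static_cast<int32_t>(dictionary_.size());
        dictionary_.push_back(record);
        it = table_.emplace(std::move(record), code).first;
      }
      codes.push_back(it->second);
    }
    return codes;
  }

  // Number of distinct keys seen so far across all batches.
  size_t num_distinct() const { return dictionary_.size(); }

 private:
  std::unordered_map<std::string, int32_t> table_;  // record -> code
  std::vector<std::string> dictionary_;             // code -> record
};
```

The resulting codes can index directly into per-group accumulator arrays, which is what makes them convenient for aggregation. A real implementation would also need to handle nulls and arbitrary field types, and would likely avoid one heap allocation per row.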