Weston Pace created ARROW-16549:
-----------------------------------

             Summary: [C++] Simplify AggregateNodeOptions aggregates/targets
                 Key: ARROW-16549
                 URL: https://issues.apache.org/jira/browse/ARROW-16549
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
            Reporter: Weston Pace

Currently AggregateNodeOptions is:

{noformat}
class ARROW_EXPORT AggregateNodeOptions : public ExecNodeOptions {
 public:
  // aggregations which will be applied to the targeted fields
  std::vector<internal::Aggregate> aggregates;
  // fields to which aggregations will be applied
  std::vector<FieldRef> targets;
  // output field names for aggregations
  std::vector<std::string> names;
  // keys by which aggregations will be grouped
  std::vector<FieldRef> keys;
};
{noformat}

It is not obvious how {{aggregates}} and {{targets}} are related. My initial read of the comments led me to think that each aggregate would be applied to each target, producing {{len(aggregates) * len(targets)}} output fields. In reality, the aggregate at index {{i}} applies only to the target at index {{i}}, so the two vectors must have the same length.

It would be simpler to add a {{FieldRef target}} member to {{internal::Aggregate}} (and {{Aggregate}} should not be {{internal}}). Alternatively, the entire {{internal::Aggregate}} could be replaced by a "call" {{arrow::compute::Expression}}.
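
To make the index-wise pairing concrete, here is a minimal sketch of the first suggestion. It does not use the real Arrow headers: {{FieldRef}}, {{Aggregate}}, {{CurrentOptions}}, {{ProposedOptions}} and {{PairAggregatesWithTargets}} are hypothetical stand-ins invented for illustration, and an aggregate is reduced to a function name. Keeping {{names}} as a separate vector is also an assumption, since the proposal above only mentions moving the target.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for arrow::FieldRef (assumption: a field is just a name).
struct FieldRef {
  std::string name;
};

// Hypothetical stand-in for the proposed public Aggregate: the target
// travels with the aggregation instead of living in a parallel vector.
struct Aggregate {
  std::string function;  // e.g. "sum"; the real type carries more than a name
  FieldRef target;       // the member proposed in this issue
};

// Mirrors the current layout: aggregates[i] applies only to targets[i].
struct CurrentOptions {
  std::vector<std::string> aggregates;
  std::vector<FieldRef> targets;
  std::vector<std::string> names;
  std::vector<FieldRef> keys;
};

// Sketch of the proposed layout: no separate targets vector.
struct ProposedOptions {
  std::vector<Aggregate> aggregates;
  std::vector<std::string> names;
  std::vector<FieldRef> keys;
};

// Converts the parallel vectors into paired aggregates. The length check is
// the invariant that the current layout leaves implicit.
ProposedOptions PairAggregatesWithTargets(const CurrentOptions& old) {
  if (old.aggregates.size() != old.targets.size()) {
    throw std::invalid_argument("aggregates and targets must have equal length");
  }
  ProposedOptions out;
  out.names = old.names;
  out.keys = old.keys;
  for (std::size_t i = 0; i < old.aggregates.size(); ++i) {
    out.aggregates.push_back({old.aggregates[i], old.targets[i]});
  }
  return out;
}
```

With this shape, two aggregates over two targets yield two output fields rather than four, and a mismatch between aggregation and target can no longer be expressed.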