[ https://issues.apache.org/jira/browse/ARROW-13451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ben Kietzman updated ARROW-13451:
---------------------------------
    Description:
Scalar aggregation does not incur large memory overhead for the associated KernelState objects, so maybe it'd be acceptable to remove explicit scalar aggregation kernels in favor of reusing grouped aggregation kernels with a single group. This would decrease our maintenance burden significantly, and if the benchmarks don't show a regression for single-group aggregation then there's no reason not to.

Even if there is a performance regression we could bundle the scalar and grouped aggregate kernels in the same compute::Function and decide between them in Dispatch*, rather than confusingly defining distinct "sum" and "hash_sum" functions

  was:
Scalar aggregation does not incur large memory overhead for the associated KernelState objects, so maybe it'd be acceptable to remove explicit scalar aggregation kernels in favor of reusing grouped aggregation kernels with a single group. This would decrease our maintenance burden significantly, and if the benchmarks don't show a regression for single-group aggregation then there's no reason not to

> [C++][Compute] Consider removing ScalarAggregateKernel
> ------------------------------------------------------
>
>                 Key: ARROW-13451
>                 URL: https://issues.apache.org/jira/browse/ARROW-13451
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Ben Kietzman
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Scalar aggregation does not incur large memory overhead for the associated
> KernelState objects, so maybe it'd be acceptable to remove explicit scalar
> aggregation kernels in favor of reusing grouped aggregation kernels with a
> single group. This would decrease our maintenance burden significantly, and
> if the benchmarks don't show a regression for single-group aggregation then
> there's no reason not to.
> Even if there is a performance regression we could bundle the scalar and
> grouped aggregate kernels in the same compute::Function and decide between
> them in Dispatch*, rather than confusingly defining distinct "sum" and
> "hash_sum" functions
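
For illustration only, the idea in the description can be sketched in plain C++ without any Arrow types. Everything here is hypothetical: GroupedSumSketch, ScalarSumViaGrouped and SumSketch are made-up names, not Arrow's kernel or Function API. The sketch assumes a grouped kernel that keeps one running sum per group id, and shows a scalar sum obtained by assigning every row to group 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a grouped "sum" kernel; this is NOT Arrow's API.
class GroupedSumSketch {
 public:
  // Make sure state exists for groups [0, num_groups).
  void Resize(std::size_t num_groups) {
    if (num_groups > sums_.size()) sums_.resize(num_groups, 0);
  }

  // Add values[i] to the running sum of group group_ids[i].
  void Consume(const std::vector<int64_t>& values,
               const std::vector<uint32_t>& group_ids) {
    assert(values.size() == group_ids.size());
    for (std::size_t i = 0; i < values.size(); ++i) {
      assert(group_ids[i] < sums_.size());
      sums_[group_ids[i]] += values[i];
    }
  }

  // One sum per group, indexed by group id.
  const std::vector<int64_t>& Finalize() const { return sums_; }

 private:
  std::vector<int64_t> sums_;
};

// Scalar aggregation expressed as grouped aggregation with a single group.
// Materializing the all-zero group ids is a cost of this naive sketch.
int64_t ScalarSumViaGrouped(const std::vector<int64_t>& values) {
  GroupedSumSketch kernel;
  kernel.Resize(1);
  kernel.Consume(values, std::vector<uint32_t>(values.size(), 0u));
  return kernel.Finalize()[0];
}

// Hypothetical single entry point that chooses the scalar or grouped path
// from its arguments, instead of exposing two differently named functions.
std::vector<int64_t> SumSketch(const std::vector<int64_t>& values,
                               const std::vector<uint32_t>* group_ids,
                               std::size_t num_groups) {
  if (group_ids == nullptr) return {ScalarSumViaGrouped(values)};
  GroupedSumSketch kernel;
  kernel.Resize(num_groups);
  kernel.Consume(values, *group_ids);
  return kernel.Finalize();
}
```

Called without group ids, SumSketch returns a single total; called with group ids it returns one total per group. Whether the single-group path is fast enough in the real kernels is exactly what the benchmarks mentioned above would have to show.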