[ 
https://issues.apache.org/jira/browse/ARROW-13451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Kietzman updated ARROW-13451:
---------------------------------
    Description: 
Scalar aggregation does not incur large memory overhead for the associated 
KernelState objects, so maybe it'd be acceptable to remove explicit scalar 
aggregation kernels in favor of reusing grouped aggregation kernels with a 
single group. This would decrease our maintenance burden significantly, and if 
the benchmarks don't show a regression for single-group aggregation then 
there's no reason not to.
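
As a rough illustration of the single-group idea, here is a minimal standalone sketch. It does not use Arrow's actual kernel classes: `GroupedSum` and `ScalarSum` are hypothetical names, and plain `std::vector` stands in for Arrow arrays.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical grouped sum kernel (not Arrow's hash_sum implementation):
// accumulates values[i] into sums[group_ids[i]].
std::vector<int64_t> GroupedSum(const std::vector<int64_t>& values,
                                const std::vector<uint32_t>& group_ids,
                                uint32_t num_groups) {
  assert(values.size() == group_ids.size());
  std::vector<int64_t> sums(num_groups, 0);
  for (std::size_t i = 0; i < values.size(); ++i) {
    assert(group_ids[i] < num_groups);
    sums[group_ids[i]] += values[i];
  }
  return sums;
}

// Scalar sum expressed as a grouped sum where every row belongs to group 0.
int64_t ScalarSum(const std::vector<int64_t>& values) {
  const std::vector<uint32_t> single_group(values.size(), 0);
  return GroupedSum(values, single_group, /*num_groups=*/1)[0];
}
```

In this sketch the all-zero group id vector is materialized on every call. That is the kind of overhead the single-group benchmark would need to measure, and a real implementation could presumably avoid it.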

Even if there is a performance regression we could bundle the scalar and 
grouped aggregate kernels in the same compute::Function and decide between them 
in Dispatch*, rather than confusingly defining distinct "sum" and "hash_sum" 
functions.
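
The bundling alternative might look roughly like the sketch below. Everything in it is an assumption made for illustration: `SumFunction`, `Grouping`, and `Execute` are hypothetical stand-ins, not the real compute::Function or Dispatch* interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical description of a grouping: one group id per row.
struct Grouping {
  std::vector<uint32_t> ids;
  uint32_t num_groups;
};

// Hypothetical single "sum" function that owns both kernel flavors.
struct SumFunction {
  // Dedicated scalar kernel: one output value, no per-group state.
  static std::vector<int64_t> ScalarKernel(const std::vector<int64_t>& values) {
    return {std::accumulate(values.begin(), values.end(), int64_t{0})};
  }

  // Grouped kernel: one output value per group.
  static std::vector<int64_t> GroupedKernel(const std::vector<int64_t>& values,
                                            const Grouping& grouping) {
    assert(values.size() == grouping.ids.size());
    std::vector<int64_t> sums(grouping.num_groups, 0);
    for (std::size_t i = 0; i < values.size(); ++i) {
      assert(grouping.ids[i] < grouping.num_groups);
      sums[grouping.ids[i]] += values[i];
    }
    return sums;
  }

  // Dispatch step: choose the kernel from the call's arguments, so callers
  // see one function name instead of separate "sum" and "hash_sum".
  static std::vector<int64_t> Execute(const std::vector<int64_t>& values,
                                      const Grouping* grouping = nullptr) {
    return grouping == nullptr ? ScalarKernel(values)
                               : GroupedKernel(values, *grouping);
  }
};
```

Here the presence of a grouping argument drives the choice of kernel. How an actual Dispatch* implementation would tell the two cases apart is left open by this issue.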

  was:Scalar aggregation does not incur large memory overhead for the 
associated KernelState objects, so maybe it'd be acceptable to remove explicit 
scalar aggregation kernels in favor of reusing grouped aggregation kernels with 
a single group. This would decrease our maintenance burden significantly, and 
if the benchmarks don't show a regression for single-group aggregation then 
there's no reason not to


> [C++][Compute] Consider removing ScalarAggregateKernel
> ------------------------------------------------------
>
>                 Key: ARROW-13451
>                 URL: https://issues.apache.org/jira/browse/ARROW-13451
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Ben Kietzman
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Scalar aggregation does not incur large memory overhead for the associated 
> KernelState objects, so maybe it'd be acceptable to remove explicit scalar 
> aggregation kernels in favor of reusing grouped aggregation kernels with a 
> single group. This would decrease our maintenance burden significantly, and 
> if the benchmarks don't show a regression for single-group aggregation then 
> there's no reason not to.
> Even if there is a performance regression we could bundle the scalar and 
> grouped aggregate kernels in the same compute::Function and decide between 
> them in Dispatch*, rather than confusingly defining distinct "sum" and 
> "hash_sum" functions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)