[ https://issues.apache.org/jira/browse/ARROW-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rok Mihevc updated ARROW-5002:
------------------------------
    External issue URL: https://github.com/apache/arrow/issues/21501

> [C++] Implement Hash Aggregation query execution node
> -----------------------------------------------------
>
>                 Key: ARROW-5002
>                 URL: https://issues.apache.org/jira/browse/ARROW-5002
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Philipp Moritz
>            Priority: Major
>              Labels: query-engine
>             Fix For: 6.0.0
>
>
> Dear all,
> I wonder what the best way forward is for implementing GroupBy kernels.
> Initially this was part of
> https://issues.apache.org/jira/browse/ARROW-4124
> but is not contained in the current implementation as far as I can tell.
> It seems that the part of group by that just returns indices could be
> conveniently implemented with the HashKernel. That seems useful in any case.
> Is that indeed the best way forward/should this be done?
> GroupBy + Aggregate could then either be implemented with that + the Take
> kernel + aggregation involving more memory copies than necessary though or as
> part of the aggregate kernel. Probably the latter is preferred, any thoughts
> on that?
> Am I missing any other JIRAs related to this?
> Best,
> Philipp.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
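
The two options raised in the issue description can be contrasted with a small sketch. This is not Arrow code and does not use any Arrow API: plain Python lists stand in for arrays, and the names `group_indices`, `take`, `groupby_sum_via_take` and `groupby_sum_fused` are hypothetical helpers invented for illustration. The first path mirrors "hash to get indices, then Take, then aggregate" and materializes a copy of each group's values; the second mirrors folding the grouping into the aggregate step, assuming a sum aggregation.

```python
def group_indices(keys):
    """Hash pass: map each distinct key to the row indices where it occurs."""
    groups = {}
    for row, key in enumerate(keys):
        groups.setdefault(key, []).append(row)
    return groups


def take(values, indices):
    """Gather the values at the given row indices (copies the selected values)."""
    return [values[i] for i in indices]


def groupby_sum_via_take(keys, values):
    """Option 1: group indices + take + aggregate; one extra copy per group."""
    return {key: sum(take(values, idx)) for key, idx in group_indices(keys).items()}


def groupby_sum_fused(keys, values):
    """Option 2: grouping folded into the aggregation; one pass, no gathered copies."""
    totals = {}
    for key, value in zip(keys, values):
        totals[key] = totals.get(key, 0) + value
    return totals


keys = ["a", "b", "a", "c", "b", "a"]
values = [1, 2, 3, 4, 5, 6]

indices = group_indices(keys)
via_take = groupby_sum_via_take(keys, values)
fused = groupby_sum_fused(keys, values)
```

Both paths produce the same per-key sums; the difference is only that the first builds an intermediate list per group before aggregating, which corresponds to the extra memory copies mentioned in the description.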