[ 
https://issues.apache.org/jira/browse/ARROW-15152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17494761#comment-17494761
 ] 

Dhruv Vats edited comment on ARROW-15152 at 2/18/22, 5:47 PM:
--------------------------------------------------------------

Is it something like:
||Arg||Key||
|Yo|1|
|Hello|2|
|Howdy|2|
|Wassup|1|
|Yolo|3|

When {{hash_list}} is applied as an aggregate, grouping by {{Key}}, the above 
table becomes:
||hash_list(Arg)||Key||
|[Yo, Wassup]|1|
|[Hello, Howdy]|2|
|[Yolo]|3|

?
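
For concreteness, the grouping semantics asked about above can be sketched in plain Python. This only illustrates the expected result and is not Arrow's kernel: the helper name {{group_values_into_lists}} is hypothetical, and emitting groups in order of each key's first appearance is an assumption.

```python
def group_values_into_lists(args, keys):
    """Collect values into one list per key (hypothetical helper, not an Arrow API)."""
    groups = {}  # dicts preserve insertion order (Python 3.7+)
    for value, key in zip(args, keys):
        # Append each value to the list for its key, creating the list on first sight.
        groups.setdefault(key, []).append(value)
    return groups


# The example table from the comment above.
args = ["Yo", "Hello", "Howdy", "Wassup", "Yolo"]
keys = [1, 2, 2, 1, 3]

result = group_values_into_lists(args, keys)
for key, values in result.items():
    print(values, key)
```

Within each group the values keep their original row order; whether the real kernel guarantees that ordering is left open here.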

 



> Implement a `hash_list` aggregate function.
> -------------------------------------------
>
>                 Key: ARROW-15152
>                 URL: https://issues.apache.org/jira/browse/ARROW-15152
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++, Python
>    Affects Versions: 6.0.1
>            Reporter: A. Coady
>            Assignee: Dhruv Vats
>            Priority: Major
>              Labels: good-second-issue, kernel
>
> For more advanced aggregations, it's helpful to be able to gather the grouped 
> values into a list array. Pandas and Polars both have that feature. And 
> `hash_distinct` already aggregates to lists, so all the building blocks are 
> there.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
