[ 
https://issues.apache.org/jira/browse/SPARK-17691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15647445#comment-15647445
 ] 

Jayadevan M commented on SPARK-17691:
-------------------------------------

I am interested in taking a look into this.

> Add aggregate function to collect list with maximum number of elements
> ----------------------------------------------------------------------
>
>                 Key: SPARK-17691
>                 URL: https://issues.apache.org/jira/browse/SPARK-17691
>             Project: Spark
>          Issue Type: New Feature
>            Reporter: Assaf Mendelson
>            Priority: Minor
>
> One of the aggregate functions we have today is the collect_list function. 
> This is a useful tool to do a "catch all" aggregation which doesn't really 
> fit anywhere else.
> The problem with collect_list is that it is unbounded. I would like to see a 
> means to do a collect_list where we limit the maximum number of elements.
> I would see that the input for this would be the maximum number of elements 
> to use and the method of choosing (pick whatever, pick the top N, pick the 
> bottom N).
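
To make the requested semantics concrete, here is a minimal sketch in plain Python, not Spark code and not an existing Spark API. The function name `bounded_collect`, the `method` values ("any", "top", "bottom"), and the assumption that values are numeric are all hypothetical choices made for illustration. It keeps at most `max_elements` values per key while streaming over the input, so memory per group stays bounded.

```python
import heapq
from collections import defaultdict


def bounded_collect(rows, max_elements, method="any"):
    """Collect at most max_elements values per key from (key, value) pairs.

    Hypothetical illustration only; assumes numeric values.
    method: "any" keeps the first values seen,
            "top" keeps the largest values,
            "bottom" keeps the smallest values.
    """
    if max_elements <= 0:
        raise ValueError("max_elements must be positive")
    if method not in ("any", "top", "bottom"):
        raise ValueError("unknown method: %r" % (method,))

    kept_by_key = defaultdict(list)
    for key, value in rows:
        kept = kept_by_key[key]
        if method == "any":
            # Keep whatever arrives first, ignore the rest.
            if len(kept) < max_elements:
                kept.append(value)
            continue
        # "top" uses a min-heap of values; "bottom" uses a min-heap of
        # negated values, so the heap root is always the next to evict.
        item = value if method == "top" else -value
        if len(kept) < max_elements:
            heapq.heappush(kept, item)
        elif item > kept[0]:
            heapq.heapreplace(kept, item)

    result = {}
    for key, kept in kept_by_key.items():
        if method == "any":
            result[key] = list(kept)
        elif method == "top":
            result[key] = sorted(kept, reverse=True)
        else:
            result[key] = sorted(-item for item in kept)
    return result


rows = [("a", 5), ("a", 1), ("a", 9), ("a", 3), ("b", 2)]
any_two = bounded_collect(rows, 2, "any")
top_two = bounded_collect(rows, 2, "top")
bottom_two = bounded_collect(rows, 2, "bottom")
```

In an actual Spark implementation the per-group buffer would presumably live in the aggregation buffer of an aggregate function, and partial buffers from different partitions would need a merge step that applies the same bound.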



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

