Assaf Mendelson created SPARK-17691:
---------------------------------------

             Summary: Add aggregate function to collect list with maximum number of elements
                 Key: SPARK-17691
                 URL: https://issues.apache.org/jira/browse/SPARK-17691
             Project: Spark
          Issue Type: New Feature
            Reporter: Assaf Mendelson
            Priority: Minor


One of the aggregate functions we have today is collect_list. It is a useful 
tool for "catch all" aggregations that don't really fit anywhere else.

The problem with collect_list is that it is unbounded: the resulting list grows 
with the number of rows in each group. I would like a way to do a collect_list 
that limits the maximum number of elements.

I would expect the inputs to be the maximum number of elements to keep and the 
method of choosing them (pick any, pick the top N, or pick the bottom N).
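
To make the requested semantics concrete, here is a minimal sketch in plain 
Python (not Spark code). The function name bounded_collect and its parameters 
max_elements and method are hypothetical, chosen only for illustration; they 
are not part of Spark's API. The point is that each group's buffer never holds 
more than max_elements + 1 values, whichever selection method is used.

```python
from collections import defaultdict
from typing import Any, Dict, Hashable, Iterable, List, Tuple


def bounded_collect(
    rows: Iterable[Tuple[Hashable, Any]],
    max_elements: int,
    method: str = "any",
) -> Dict[Hashable, List[Any]]:
    """Collect at most max_elements values per key (hypothetical helper).

    method "any" keeps the first values seen, "top" keeps the largest,
    and "bottom" keeps the smallest.
    """
    if max_elements < 0:
        raise ValueError("max_elements must be non-negative")
    if method not in ("any", "top", "bottom"):
        raise ValueError("method must be 'any', 'top' or 'bottom'")

    groups: Dict[Hashable, List[Any]] = defaultdict(list)
    for key, value in rows:
        buf = groups[key]
        if method == "any":
            # Keep whatever arrives first and ignore the rest.
            if len(buf) < max_elements:
                buf.append(value)
        else:
            # Add the value, then evict one element if the buffer overflowed:
            # the smallest for "top", the largest for "bottom".
            buf.append(value)
            if len(buf) > max_elements:
                buf.remove(min(buf) if method == "top" else max(buf))

    if method == "top":
        return {k: sorted(v, reverse=True) for k, v in groups.items()}
    if method == "bottom":
        return {k: sorted(v) for k, v in groups.items()}
    return dict(groups)


rows = [("a", 5), ("a", 1), ("a", 9), ("a", 3), ("b", 2)]
print(bounded_collect(rows, 2, "any"))
print(bounded_collect(rows, 2, "top"))
print(bounded_collect(rows, 2, "bottom"))
```

The linear-time eviction keeps the sketch short. A real aggregate would more 
likely use a heap per group and would also need a merge step for partial 
results computed on different partitions.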


