Thanks for the quick reply and explanation, @rxin. So if one does not want to collect()/take() but wants the top k as a dataset for further transformations, there is no optimized API. That is why I am suggesting adding "top()" as a public method.
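To make the idea concrete, here is a rough sketch of the semantics I have in mind, written in plain Python rather than against the Spark API. The names `top_k` and `top_k_per_partition` are hypothetical, and the nested lists only stand in for partitions. This is an assumption about what such a method could compute, not a description of how Spark works internally.

```python
import heapq
from typing import Iterable, List


def top_k_per_partition(partition: Iterable[int], k: int) -> List[int]:
    """Hypothetical helper: keep only the k largest values of one partition."""
    return heapq.nlargest(k, partition)


def top_k(partitions: List[List[int]], k: int) -> List[int]:
    """Hypothetical top(): merge per-partition candidates, keep the global k largest."""
    candidates = [x for part in partitions for x in top_k_per_partition(part, k)]
    return heapq.nlargest(k, candidates)


# Lists standing in for the partitions of a dataset (illustrative data only).
partitions = [[5, 1, 9], [7, 3], [8, 2, 6]]
result = top_k(partitions, 2)

# The result is still an ordinary collection, so it can be transformed further
# instead of being pulled out with collect()/take().
doubled = [x * 2 for x in result]
```

The point of the sketch is that each partition contributes at most k candidates, so a full sort of the data is never needed before continuing with other transformations.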
If that sounds like a good idea, I will open a ticket and implement it.

Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/