You can use `Dataset.limit`, which returns a new `Dataset` instead of an Array. You can then transform it further and still get the top-k optimization from Spark.
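For illustration, here is a minimal sketch of that approach, assuming a local `SparkSession` and a made-up `Score` case class (both are my own placeholders, not something from this thread). It sorts, calls `limit`, and keeps transforming the result as a `Dataset`; `explain()` is included so you can check which physical plan your Spark version actually picks.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

object TopKAsDataset {
  // Hypothetical record type, used only for this illustration.
  final case class Score(name: String, value: Int)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("top-k-as-dataset")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val scores: Dataset[Score] = Seq(
      Score("a", 5), Score("b", 9), Score("c", 1), Score("d", 7)
    ).toDS()

    // Sort descending and keep the first k rows; the result is still a Dataset.
    val k = 2
    val topK: Dataset[Score] = scores.orderBy($"value".desc).limit(k)

    // Further transformations stay lazy until an action runs.
    val names: Dataset[String] = topK.map(_.name)

    names.explain() // inspect the physical plan Spark chose
    names.show()

    spark.stop()
  }
}
```

Nothing is collected to the driver until `show()` runs, so `topK` can feed joins, aggregations, or writes like any other `Dataset`.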
On Wed, Jan 31, 2018 at 3:39 PM, Yacine Mazari <y.maz...@gmail.com> wrote:
> Thanks for the quick reply and explanation @rxin.
>
> So if one does not want to collect()/take() but want the top k as a dataset
> to do further transformations there is no optimized API, that's why I am
> suggesting adding this "top()" as a public method.
>
> If that sounds like a good idea, I will open a ticket and implement it.