Hi, I completely agree with using DataFrames for most operations in Spark, unless you are writing a custom algorithm, or algorithms that need to use the RDD API directly. Databricks has taken a cue from Apache Flink (I think) and written Tungsten as the base engine that drives DataFrames, so there are performance optimizations built in.
Regards,
Gourav Sengupta

On Fri, Mar 4, 2016 at 8:35 AM, Mohammad Tariq <donta...@gmail.com> wrote:

> You could try DataFrame.sort() to sort your data based on a column.
>
> Tariq, Mohammad
> about.me/mti <http://about.me/mti>
>
> On Fri, Mar 4, 2016 at 1:48 PM, Angel Angel <areyouange...@gmail.com>
> wrote:
>
>> hello sir,
>>
>> i want to sort the following table as per the *count*
>>
>> value count
>> 52639 22
>> 75243 4
>> 13 55
>> 56 5
>> 185463 45
>> 324364 32
>>
>> So first i convert my dataframe to an rdd to sort the table:
>>
>> val k = table.rdd
>>
>> then i convert the rdd rows into key-value pairs:
>>
>> val s = k.take(6)
>>
>> val rdd = s.map(x => (x(1), x(0)))
>> rdd.sortByKey
>>
>> These are all the operations i did to sort the table.
>>
>> Please can you suggest me a better way to sort the table?
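For reference, the sort asked about in the quoted message can be done entirely within the DataFrame API, without dropping to RDDs at all. A minimal sketch, assuming a DataFrame named `table` with columns `value` and `count` as in the quoted table (this uses the Spark 2.x `SparkSession` entry point; on the 1.x releases current at the time of this thread you would start from a `SQLContext` instead):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.desc

object SortExample {
  def main(args: Array[String]): Unit = {
    // Hypothetical local session; in the spark-shell, `spark` is already provided.
    val spark = SparkSession.builder()
      .appName("sort-example")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Sample data matching the table in the quoted message.
    val table = Seq(
      (52639, 22), (75243, 4), (13, 55),
      (56, 5), (185463, 45), (324364, 32)
    ).toDF("value", "count")

    // Sort by the `count` column, descending, staying in the DataFrame API;
    // Catalyst/Tungsten can optimize this, unlike a hand-rolled RDD sortByKey.
    val sorted = table.orderBy(desc("count"))
    sorted.show()

    spark.stop()
  }
}
```

`orderBy` is an alias for `DataFrame.sort`, so `table.sort($"count".desc)` is equivalent; converting to an RDD of tuples and calling `sortByKey` gives the same ordering but bypasses the optimizer.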