Spark SQL sort by and collect by in multiple partitions

2015-09-02 Thread Niranda Perera
Hi all, I have been using SORT BY and ORDER BY in Spark SQL, and I observed the following: when I use SORT BY and collect the results, the results are sorted partition by partition. Example: if we have 1, 2, ..., 12 in 4 partitions and I want to sort in descending order, partition 0 (p0) w

Re: Spark SQL sort by and collect by in multiple partitions

2015-09-02 Thread Vishnu Kumar
Hi, Yes, this is the intended behavior. "ORDER BY" guarantees a total order in the output, while "SORT BY" guarantees the order only within each partition. Vishnu On Thu, Sep 3, 2015 at 10:49 AM, Niranda Perera wrote: > Hi all, > > I have been using sort by and order by in spark sql and I observed the > fol
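To make the difference concrete, here is a minimal sketch in plain Python that simulates the two behaviors described above. It does not call Spark. The helper names (`split_into_partitions`, `sort_by_desc`, `order_by_desc`) are hypothetical, and the contiguous assignment of values to partitions is an assumption for illustration only, since the original example is cut off before it says what each partition holds. It also assumes that collecting concatenates the partitions in partition order.

```python
def split_into_partitions(values, num_partitions):
    """Split values into contiguous chunks (assumed layout, illustration only)."""
    size = -(-len(values) // num_partitions)  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def sort_by_desc(partitions):
    """SORT BY analogue: sort each partition on its own, then concatenate
    the partitions in partition order, as a collect would."""
    result = []
    for part in partitions:
        result.extend(sorted(part, reverse=True))
    return result


def order_by_desc(partitions):
    """ORDER BY analogue: one total order across all partitions."""
    return sorted((v for part in partitions for v in part), reverse=True)


partitions = split_into_partitions(list(range(1, 13)), 4)
print(sort_by_desc(partitions))   # [3, 2, 1, 6, 5, 4, 9, 8, 7, 12, 11, 10]
print(order_by_desc(partitions))  # [12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
```

Under these assumptions, the first result is ordered inside each partition but not across partitions, which matches the observation in the question. If a total order is needed in the collected output, ORDER BY is the one to use.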