An ORDER BY needs to be on the outermost query otherwise subsequent operations (such as the join) could reorder the tuples.
On Mon, Jul 20, 2015 at 9:25 AM, Carol McDonald <cmcdon...@maprtech.com> wrote: > the following query on the Movielens dataset , is sorting by the count of > ratings for a movie. It looks like the results are ordered by partition > ? > scala> val results =sqlContext.sql("select movies.title, movierates.maxr, > movierates.minr, movierates.cntu from(SELECT ratings.product, > max(ratings.rating) as maxr, min(ratings.rating) as minr,count(distinct > user) as cntu FROM ratings group by ratings.product order by cntu desc) > movierates join movies on movierates.product=movies.movieId ") > > scala> results.take(30).foreach(println) > [Right Stuff, The (1983),5.0,1.0,750] > [Lost in Space (1998),5.0,1.0,667] > [Dumb & Dumber (1994),5.0,1.0,660] > [Patch Adams (1998),5.0,1.0,474] > [Carlito's Way (1993),5.0,1.0,369] > [Rounders (1998),5.0,1.0,345] > [Bedknobs and Broomsticks (1971),5.0,1.0,319] > [Beverly Hills Ninja (1997),5.0,1.0,232] > [Saving Grace (2000),5.0,1.0,186] > [Dangerous Minds (1995),5.0,1.0,141] > [Death Wish II (1982),5.0,1.0,85] > [All Dogs Go to Heaven 2 (1996),5.0,1.0,75] > [Repossessed (1990),4.0,1.0,53] > [Assignment, The (1997),5.0,1.0,49] > [$1,000,000 Duck (1971),5.0,1.0,37] > [Stonewall (1995),5.0,1.0,20] > [Dog of Flanders, A (1999),5.0,1.0,8] > [Frogs for Snakes (1998),3.0,1.0,5] > [It's in the Water (1998),3.0,2.0,3] > [Twelve Monkeys (1995),5.0,1.0,1511] > [Ransom (1996),5.0,1.0,564] > [Alice in Wonderland (1951),5.0,1.0,525] > [City Slickers II: The Legend of Curly's Gold (1994),5.0,1.0,392] > [Eat Drink Man Woman (1994),5.0,1.0,346] > [Cube (1997),5.0,1.0,233] > [Omega Man, The (1971),5.0,1.0,224] > [Stepmom (1998),5.0,1.0,146] > [Metro (1997),5.0,1.0,100] > [Death Wish 3 (1985),5.0,1.0,72] > [Stalker (1979),5.0,1.0,52] >