I think you are interested in secondary sort, which is still being worked
on:
https://issues.apache.org/jira/browse/SPARK-3655
On Tue, Feb 3, 2015 at 4:41 PM, Nitin kak nitinkak...@gmail.com wrote:
I thought that's what sort-based shuffle did: sort the keys going to the
same partition.
I
Nitin,
Suing Spark is not going to help. Perhaps you should sue someone else :-) Just
kidding!
Mohammed
-----Original Message-----
From: nitinkak001 [mailto:nitinkak...@gmail.com]
Sent: Tuesday, February 3, 2015 1:57 PM
To: user@spark.apache.org
Subject: Re: Sort based shuffle not working
Hm, I don't think the sort partitioner is going to cause the result to
be ordered by c1,c2 if you only partitioned on c1. I mean, it's not
even guaranteed that the type of c2 has an ordering, right?
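To illustrate the distinction, here is a minimal plain-Python sketch (not
Spark code). The function names partition_on_c1 and secondary_sort and the
sample records are made up for illustration. It assumes records are (c1, c2)
tuples: they are routed to a partition by c1 only, and ordering by (c1, c2)
appears only after an explicit sort inside each partition. As far as I know,
repartitionAndSortWithinPartitions in Spark 1.2 and later follows a similar
pattern when it is given a partitioner that looks only at c1.

```python
from collections import defaultdict


def partition_on_c1(records, num_partitions):
    """Route each (c1, c2) record to a partition using c1 only."""
    partitions = defaultdict(list)
    for c1, c2 in records:
        # Only c1 decides the partition; c2 plays no part in routing.
        partitions[hash(c1) % num_partitions].append((c1, c2))
    return dict(partitions)


def secondary_sort(records, num_partitions):
    """Partition on c1, then sort each partition by the full (c1, c2) key."""
    partitions = partition_on_c1(records, num_partitions)
    return {pid: sorted(rows) for pid, rows in partitions.items()}


# Hypothetical sample data, chosen only to show the effect.
records = [(2, 9), (1, 5), (2, 3), (1, 1), (3, 7), (1, 4)]

unsorted_parts = partition_on_c1(records, num_partitions=2)
result = secondary_sort(records, num_partitions=2)

for pid in sorted(result):
    print(pid, "partitioned only:", unsorted_parts[pid])
    print(pid, "sorted by (c1, c2):", result[pid])
```

Partitioning alone keeps all rows with the same c1 together but leaves them
in arrival order; the (c1, c2) ordering has to be requested separately.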
On Tue, Feb 3, 2015 at 3:38 PM, nitinkak001 nitinkak...@gmail.com wrote:
I am trying to implement
This is an excerpt from the design document for the implementation of
sort-based shuffle. I am thinking I might be wrong in my perception of
sort-based shuffle. I don't completely understand it, though.
*Motivation*
A sort-based shuffle can be more scalable than Spark’s current hash-based
one because
I thought that's what sort-based shuffle did: sort the keys going to the
same partition.
I have tried (c1, c2) as an (Int, Int) tuple as well. I don't think the
ordering of the c2 type is the problem here.
On Tue, Feb 3, 2015 at 5:21 PM, Sean Owen so...@cloudera.com wrote:
Hm, I don't think the sort
Just to add, I am suing Spark 1.1.0
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Sort-based-shuffle-not-working-properly-tp21487p21488.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.