RE: [External] Re: Sorting in Spark on multiple partitions

2018-06-06 Thread Sing, Jasbir
ra...@gmail.com] Sent: Monday, June 4, 2018 10:59 PM To: Jain, Neha T. <neha.t.j...@accenture.com> Cc: user@spark.apache.org; Patel, Payal <payal.pa...@accenture.com>; Sing, Jasbir <jasbir.s...@accenture.com> Subject: Re: [Extern

Re: [External] Re: Sorting in Spark on multiple partitions

2018-06-04 Thread Jörn Franke
s across multiple nodes. > > Thanks & Regards, > Neha Jain > > From: Jörn Franke [mailto:jornfra...@gmail.com] > Sent: Monday, June 4, 2018 10:48 AM > To: Sing, Jasbir > Cc: user@spark.apache.org; Patel, Payal ; Jain, > Neha T. > Subject: [External] Re: Sor

Re: [External] Re: Sorting in Spark on multiple partitions

2018-06-04 Thread Jörn Franke
nodes. > > Thanks & Regards, > Neha Jain > > From: Jörn Franke [mailto:jornfra...@gmail.com] > Sent: Monday, June 4, 2018 10:48 AM > To: Sing, Jasbir > Cc: user@spark.apache.org; Patel, Payal ; Jain, > Neha T. > Subject: [External] Re: Sorting in Sp

Re: Sorting in Spark on multiple partitions

2018-06-03 Thread Jörn Franke
You partition by userid, why do you then sort again by userid in the partition? Can you try to remove userid from the sort? How do you check if the sort is correct or not? What is the underlying objective of the sort? Do you have more information on schema and data? > On 4. Jun 2018, at
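
One way to approach the "how do you check if the sort is correct" question is to test each partition on its own, since sorting within partitions gives no ordering across them. The sketch below is a plain-Python model and not Spark code: the column names `userid` and `ts`, the modulo partitioner, and all helper names are assumptions made for illustration. In Spark the corresponding calls would be along the lines of `repartition` on `userid` followed by `sortWithinPartitions`. It also shows what dropping `userid` from the sort key does when a partition happens to hold more than one user, which hash partitioning generally allows.

```python
# Plain-Python model of "partition by userid, then sort inside each partition".
# Column names (userid, ts) and the modulo partitioner are illustrative assumptions.

def partition_by_userid(rows, num_partitions):
    """Assign each row to a partition from its userid (stand-in for hash partitioning)."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[row["userid"] % num_partitions].append(row)
    return partitions


def sort_within_partitions(partitions, key):
    """Sort every partition independently; no ordering is implied across partitions."""
    return [sorted(part, key=key) for part in partitions]


def is_sorted(partition, key):
    """True if consecutive rows never go backwards under the given key."""
    return all(key(a) <= key(b) for a, b in zip(partition, partition[1:]))


def users_contiguous(partition):
    """True if each userid forms one unbroken run of rows in the partition."""
    seen, previous = set(), None
    for row in partition:
        uid = row["userid"]
        if uid != previous and uid in seen:
            return False
        seen.add(uid)
        previous = uid
    return True


# Hypothetical sample data: three users spread over two partitions.
rows = [
    {"userid": 1, "ts": 30}, {"userid": 3, "ts": 10},
    {"userid": 2, "ts": 20}, {"userid": 1, "ts": 5},
    {"userid": 3, "ts": 40}, {"userid": 2, "ts": 15},
]
partitions = partition_by_userid(rows, num_partitions=2)

# Sort key with userid first, then the secondary column.
by_user_then_ts = sort_within_partitions(
    partitions, key=lambda r: (r["userid"], r["ts"])
)
# Sort key with userid removed.
by_ts_only = sort_within_partitions(partitions, key=lambda r: r["ts"])
```

In this sample, users 1 and 3 share a partition. Sorting on `(userid, ts)` keeps each user's rows together, while sorting on `ts` alone orders the partition correctly by `ts` but interleaves the two users. Whether that matters depends on the objective of the sort, which is what the reply is asking about.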