Do you have a particular concern? You’re always using a partitioner (the 
default is HashPartitioner), and the Partitioner interface is pretty light, so 
I can’t see how a custom one would hurt performance.

Used correctly, it should improve performance, since you can better control 
data placement and avoid shuffling…
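
To make the "pretty light" point concrete, here is a minimal sketch of a 
custom partitioner for secondary sorting. It assumes composite keys shaped as 
(primary, secondary) tuples and routes records by the primary part only; the 
class name PrimaryKeyPartitioner is hypothetical, not taken from the blog post.

```scala
import org.apache.spark.Partitioner

// Hypothetical partitioner, for illustration only.
// Assumes keys are (primary, secondary) tuples; only the primary part
// decides the partition, so all records sharing a primary key land together.
class PrimaryKeyPartitioner(partitions: Int) extends Partitioner {
  require(partitions > 0, "partitions must be positive")

  override def numPartitions: Int = partitions

  override def getPartition(key: Any): Int = key match {
    case (primary, _) =>
      val hash = if (primary == null) 0 else primary.hashCode
      // hashCode may be negative, so normalise the remainder to [0, partitions).
      val mod = hash % partitions
      if (mod < 0) mod + partitions else mod
    case other =>
      throw new IllegalArgumentException(
        s"Expected a (primary, secondary) tuple but got: $other")
  }

  // Spark compares partitioners to decide whether two RDDs are already
  // co-partitioned, which is what lets it skip a shuffle.
  override def equals(other: Any): Boolean = other match {
    case p: PrimaryKeyPartitioner => p.numPartitions == numPartitions
    case _                        => false
  }

  override def hashCode: Int = numPartitions
}
```

On a pair RDD keyed by such tuples, something like 
rdd.repartitionAndSortWithinPartitions(new PrimaryKeyPartitioner(8)) would 
then group by the primary key and sort by the full tuple in a single shuffle. 
The partition count of 8 is an arbitrary example value.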

-adrian

From: swetha kasireddy
Date: Monday, October 26, 2015 at 6:56 AM
To: Adrian Tanase
Cc: Bill Bejeck, "user@spark.apache.org"
Subject: Re: Secondary Sorting in Spark

Hi,

Does using a custom partitioner in Spark Streaming affect performance?

On Mon, Oct 5, 2015 at 1:06 PM, Adrian Tanase <atan...@adobe.com> wrote:
Great article, especially the use of a custom partitioner.

Also, sorting by multiple fields by building a tuple out of them is an 
awesome, easy-to-miss Scala feature.
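
A small sketch of that tuple trick in plain Scala, with no Spark needed. The 
Flight record and its fields are hypothetical, made up for illustration:

```scala
// Hypothetical record type, used only for illustration.
final case class Flight(airline: String, delayMinutes: Int)

object TupleSortExample {
  val flights: List[Flight] = List(
    Flight("UA", 30),
    Flight("AA", 5),
    Flight("UA", 10),
    Flight("AA", 45)
  )

  // Scala supplies an implicit Ordering for tuples that compares element by
  // element, so this sorts by airline first and by delay second.
  val sorted: List[Flight] =
    flights.sortBy(f => (f.airline, f.delayMinutes))

  // Negating a numeric field reverses the order for that field only:
  // airline ascending, delay descending.
  val sortedDelayDesc: List[Flight] =
    flights.sortBy(f => (f.airline, -f.delayMinutes))
}
```

The same implicit tuple Ordering is what lets a composite (primary, secondary) 
key be sorted without writing a custom comparator.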

Sent from my iPhone

On 04 Oct 2015, at 21:41, Bill Bejeck <bbej...@gmail.com> wrote:

I've written a blog post on secondary sorting in Spark and thought I'd share 
it with the group:

http://codingjunkie.net/spark-secondary-sort/

Thanks,
Bill
