Do you have a particular concern? You’re always using a partitioner (the default is HashPartitioner), and the Partitioner interface is pretty light, so I can’t see how a custom one could hurt performance.
Used correctly, it should improve performance, as you can better control the placement of data and avoid shuffling…

-adrian

From: swetha kasireddy
Date: Monday, October 26, 2015 at 6:56 AM
To: Adrian Tanase
Cc: Bill Bejeck, "user@spark.apache.org"
Subject: Re: Secondary Sorting in Spark

Hi,

Does the use of a custom partitioner in Streaming affect performance?

On Mon, Oct 5, 2015 at 1:06 PM, Adrian Tanase <atan...@adobe.com> wrote:

Great article, especially the use of a custom partitioner. Also, sorting by multiple fields by creating a tuple out of them is an awesome, easy-to-miss Scala feature.

On 04 Oct 2015, at 21:41, Bill Bejeck <bbej...@gmail.com> wrote:

I've written a blog post on secondary sorting in Spark, and I thought I'd share it with the group:

http://codingjunkie.net/spark-secondary-sort/

Thanks,
Bill
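
As a rough illustration of the custom-partitioner idea discussed in this thread, here is a minimal plain-Python sketch. It is not Spark code and is not taken from the blog post. It assumes composite keys of the form (natural_key, secondary_field); the function names and sample records are hypothetical. Partitioning on the natural key alone places every record for that key in the same partition, so sorting within each partition gives the secondary order without any further movement of data.

```python
from collections import defaultdict


def natural_key_partition(key, num_partitions):
    """Hypothetical custom partitioner: route by the first field of a composite key only."""
    natural_key, _secondary = key
    # Python's % always yields a non-negative result for a positive modulus.
    return hash(natural_key) % num_partitions


def partition_and_sort(records, num_partitions, partition_func):
    """Group (key, value) records by partition id, then sort each partition by key."""
    partitions = defaultdict(list)
    for key, value in records:
        partitions[partition_func(key, num_partitions)].append((key, value))
    # Sorting by the full composite key orders records by natural key,
    # then by the secondary field, inside each partition.
    return {pid: sorted(rows) for pid, rows in partitions.items()}


# Hypothetical sample data: (carrier, sequence number) -> payload.
records = [
    (("AA", 3), "c"),
    (("DL", 1), "x"),
    (("AA", 1), "a"),
    (("DL", 2), "y"),
    (("AA", 2), "b"),
]

result = partition_and_sort(records, 4, natural_key_partition)

for pid, rows in sorted(result.items()):
    print(pid, rows)
```

In Spark itself, the analogous pieces would be a Partitioner subclass (which only needs numPartitions and getPartition) together with repartitionAndSortWithinPartitions; treat this mapping as an approximation rather than a drop-in translation.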