You can override the default partitioner with the RangePartitioner
<https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/Partitioner.scala#L92>,
which distributes data into roughly equal-sized partitions.
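For example, something along these lines (a rough sketch; the keys and the
partition count here are placeholders, not taken from your job):

import org.apache.spark.{RangePartitioner, SparkContext}

val sc = new SparkContext("local[*]", "range-partitioner-sketch")
// Build a keyed RDD; in practice the key would be whatever you partition on.
val pairs = sc.parallelize(1 to 1000000).map(i => (i, i.toString))
// RangePartitioner samples the keys and picks range boundaries so that each
// of the requested partitions receives roughly the same number of records.
val partitioner = new RangePartitioner(100, pairs)
val balanced = pairs.partitionBy(partitioner)
println(balanced.partitions.length)  // 100 (fewer if there are few distinct keys)

Note that this balances the number of records per partition, not the cost of
processing each record, so if a handful of complex instruments dominate the
runtime you may still need to spread those out explicitly.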


On Thu, May 1, 2014 at 11:14 PM, deenar.toraskar <deenar.toras...@db.com> wrote:

> Yes
>
> On a job I am currently running, 99% of the partitions finish within
> seconds, and a couple of partitions take around an hour to finish. I am
> pricing some instruments, and complex instruments take far longer to
> price than plain vanilla ones. If I could distribute these complex
> instruments evenly, the overall job time would be greatly reduced.
>
>
> Deenar
>
>
>
