> For example, does Spark try to merge the small partitions first, or is the
> selection of partitions to merge random?
It is quite smart, as Iulian has pointed out, but it does not try to merge
small partitions first: Spark doesn't know the sizes of the partitions. (The
partitions are represented as Iterators.)
It's smart. Have a look at
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala#L123
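The coalescer linked above also weighs data locality, but the key point stands: grouping is driven by partition indices and host preferences, not by partition size. As a rough, hypothetical sketch (not Spark's actual code), the size-oblivious part of the idea is just bucketing parent partitions into contiguous, evenly sized groups:

```scala
// A minimal, hypothetical sketch of size-oblivious coalescing: parent
// partitions are split into contiguous, balanced groups by index alone.
// Spark's real DefaultPartitionCoalescer additionally considers locality.
object CoalesceSketch {
  // Assign each of `numParents` parent partitions to one of `numGroups` buckets.
  def group(numParents: Int, numGroups: Int): Seq[Seq[Int]] =
    (0 until numGroups).map { g =>
      val start = (g * numParents) / numGroups        // integer arithmetic yields
      val end   = ((g + 1) * numParents) / numGroups  // contiguous, balanced slices
      (start until end).toList
    }

  def main(args: Array[String]): Unit = {
    // 10 parent partitions coalesced into 3 groups of sizes 3/3/4,
    // chosen purely by index; a tiny partition is merged like any other.
    println(group(10, 3)) // Vector(List(0, 1, 2), List(3, 4, 5), List(6, 7, 8, 9))
  }
}
```

Because the grouping is a narrow dependency (each output partition reads a fixed set of parents), no shuffle is needed, which is exactly why `coalesce(n)` with `shuffle = false` is cheap.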
On Thu, Oct 8, 2015 at 4:00 AM, Cesar Flores wrote:
> It is my understanding that the default behavior of the coalesce function
> when the user reduces the number of partitions is to only merge them without
> executing a shuffle.
It is my understanding that the default behavior of the coalesce function
when the user reduces the number of partitions is to only merge them without
executing a shuffle.
My question is: is this merging smart? For example, does Spark try to merge
the small partitions first, or is the selection of partitions to merge random?