Re: Is coalesce smart while merging partitions?

2015-10-08 Thread Iulian Dragoș
It's smart. Have a look at https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala#L123
On Thu, Oct 8, 2015 at 4:00 AM, Cesar Flores wrote:
> It is my understanding that the default behavior of coalesce function when the user

Re: Is coalesce smart while merging partitions?

2015-10-08 Thread Daniel Darabos
> For example, does Spark try to merge the small partitions first, or is the selection of partitions to merge random?
It is quite smart, as Iulian has pointed out, but it does not try to merge small partitions first. Spark doesn't know the size of partitions. (The partitions are represented as
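To make the point about size-blind merging concrete, here is a toy model in plain Python (not Spark code). It assumes partitions are grouped purely by index into contiguous ranges, which is a simplification: the real coalescer in the linked CoalescedRDD.scala also takes data locality into account. The names coalesce_by_index and group_sizes, and the sizes used, are made up for illustration.

```python
def coalesce_by_index(num_partitions, num_groups):
    """Split partition indices 0..num_partitions-1 into num_groups
    contiguous ranges. Partition sizes are never consulted, mirroring
    the fact that the sizes are not known when partitions are merged."""
    groups = []
    for g in range(num_groups):
        start = g * num_partitions // num_groups
        end = (g + 1) * num_partitions // num_groups
        groups.append(list(range(start, end)))
    return groups


def group_sizes(partition_sizes, groups):
    """Total record count of each merged group (for inspection only)."""
    return [sum(partition_sizes[i] for i in group) for group in groups]


# Hypothetical record counts: two large partitions followed by four small ones.
sizes = [1000, 900, 10, 5, 8, 12]

groups = coalesce_by_index(len(sizes), 2)
merged = group_sizes(sizes, groups)

print(groups)
print(merged)
```

In this sketch the two large partitions end up in the same group, so the result is heavily skewed. If balanced output matters more than avoiding a shuffle, coalesce with shuffle = true (or repartition) redistributes the records instead.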

Is coalesce smart while merging partitions?

2015-10-07 Thread Cesar Flores
It is my understanding that the default behavior of the coalesce function, when the user reduces the number of partitions, is to only merge them without executing a shuffle. My question is: Is this merging smart? For example, does Spark try to merge the small partitions first, or is the selection of partitions