> For example does spark try to merge the small partitions first or the election of partitions to merge is random?
It is quite smart, as Iulian has pointed out, but it does not try to merge small partitions first. Spark doesn't know the size of the partitions. (Each partition is represented as an Iterator, and you cannot know an iterator's size without consuming it.)
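To illustrate the iterator point, here is a minimal sketch in plain Python rather than Spark code. It assumes a partition behaves like any one-shot iterator; `count_elements` and `partition` are hypothetical names made up for this example.

```python
def count_elements(partition):
    """Count the records by walking the iterator, which exhausts it."""
    return sum(1 for _ in partition)


# Stand-in for a single partition: a one-shot iterator over three records.
partition = iter(["a", "b", "c"])

size = count_elements(partition)  # counting is the only way to learn the size
remaining = list(partition)       # the iterator is now empty

print(size, remaining)  # prints: 3 []
```

Once the records have been counted there is nothing left to process, so sizing a partition up front would mean reading the data twice or buffering it.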