Hi Matt,

I have encountered the same issue several times, so I fully agree that this would be a useful addition to Spark. I usually work around the imbalance by writing a custom partitioner, which is far from ideal since it forces me to drop down to the RDD API. I don't know the Spark code base well enough to judge how complex or feasible the change would be, though.
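For illustration, the kind of workaround I mean looks roughly like the sketch below. It is untested and the names are placeholders: it assumes a pair RDD keyed by String and record counts per key (e.g. from countByKey) as a stand-in for real size estimates, and greedily packs keys into buckets of roughly a target record count.

    import org.apache.spark.Partitioner
    import scala.collection.mutable

    // Hypothetical size-aware partitioner: each key is pre-assigned to a bucket
    // so that every bucket's estimated record count stays near a target.
    class SizeAwarePartitioner(assignments: Map[String, Int], buckets: Int)
        extends Partitioner {
      override def numPartitions: Int = buckets
      override def getPartition(key: Any): Int =
        assignments.getOrElse(key.asInstanceOf[String], 0)
    }

    // Greedily pack keys (largest first) into buckets of ~targetPerBucket records.
    def buildAssignments(keyCounts: Map[String, Long],
                         targetPerBucket: Long): (Map[String, Int], Int) = {
      val assignments = mutable.Map.empty[String, Int]
      var bucket = 0
      var filled = 0L
      keyCounts.toSeq.sortBy(-_._2).foreach { case (key, count) =>
        if (filled > 0 && filled + count > targetPerBucket) { bucket += 1; filled = 0L }
        assignments(key) = bucket
        filled += count
      }
      (assignments.toMap, bucket + 1)
    }

    // Usage (keyCounts and the target are job-specific):
    // val (assignments, n) = buildAssignments(keyCounts, targetPerBucket = 1000000L)
    // val balanced = pairRdd.partitionBy(new SizeAwarePartitioner(assignments, n))

It works, but it means hand-rolling the size estimation and staying at the RDD level, which is exactly what a built-in target-size coalesce would avoid.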
Best,
Pol Santamaria

On Tue, Mar 30, 2021 at 1:30 PM mhawes <hawes.i...@gmail.com> wrote:

> Hi all, Sending this first before creating a Jira issue, in an effort to
> start a discussion :)
>
> Problem:
>
> We have a situation where we end up with a very large number (O(10K)) of
> partitions, with very little data in most partitions but a lot of data in
> some of them. This not only causes slow execution but also runs us into
> issues like SPARK-12837 <https://issues.apache.org/jira/browse/SPARK-12837>.
> Naturally the solution here is to coalesce these partitions, however it's
> impossible for us to know upfront the size and number of partitions we're
> going to get, and this can change on each run. If we naively coalesce to,
> say, 50 partitions, we're likely to end up with very poorly distributed
> data and potentially executor OOMs. This is because coalesce tries to
> balance the number of partitions rather than the partition sizes.
>
> Proposed Solution
>
> I was therefore thinking about contributing a feature whereby a user could
> specify a target partition size and Spark would aim to get as close as
> possible to that. Something along these lines has been partially attempted
> in SPARK-14042 <https://issues.apache.org/jira/browse/SPARK-14042>, but it
> falls short because there is no generic way (AFAIK) of getting access to
> the partition sizes before query execution. So I would suggest we try to do
> something similar to AQE's "Post Shuffle Partition Coalescing", whereby we
> allow the user to trigger an adaptive coalesce even without a shuffle. The
> optimal partitioning would be calculated dynamically at runtime. This way
> we don't pay the price of a shuffle but can still get reasonable partition
> sizes.
>
> What do people think? Happy to clarify things upon request. :)
>
> Best,
> Matt
>
>
> --
> Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
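For reference, the existing AQE post-shuffle coalescing that Matt mentions above is driven by settings along these lines in Spark 3.x (the 128m target here is only an example value); the gap is that it only applies at shuffle boundaries, which is exactly what the proposal would relax:

    import org.apache.spark.sql.SparkSession

    // Current AQE behaviour: post-shuffle partitions are coalesced towards the
    // advisory size, but only where a shuffle exists to provide the statistics.
    val spark = SparkSession.builder()
      .appName("aqe-coalesce")
      .config("spark.sql.adaptive.enabled", "true")
      .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
      .config("spark.sql.adaptive.advisoryPartitionSizeInBytes", "128m") // example target
      .getOrCreate()

    // A wide operation (groupBy, join, repartition) gives AQE the per-partition
    // statistics it needs; a purely narrow pipeline never hits this code path.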