Okay, from looking closer at some of the code, I'm not sure that what I'm
asking for in terms of adaptive execution makes much sense, as it can only
happen between stages, i.e. optimising future /stages/ based on the results
of previous stages. Thus an "on-demand" adaptive coalesce doesn't make much
sense, as it wouldn't necessarily occur at a stage boundary.
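To make the stage-boundary idea concrete, here is a minimal sketch (plain
Python, purely illustrative and not Spark's actual implementation) of what an
adaptive planner can do between stages: look at the map-output sizes the
previous stage produced, and greedily merge adjacent small shuffle partitions
up to a target size before launching the next stage. All names here are
hypothetical.

```python
def plan_coalesced_partitions(partition_sizes, target_bytes):
    """Greedily group adjacent shuffle partitions until each group
    reaches roughly target_bytes. Loosely mimics how an adaptive
    planner could choose the next stage's partitioning from the
    previous stage's output statistics. Illustrative only; this is
    not Spark's API."""
    groups, current, current_size = [], [], 0
    for i, size in enumerate(partition_sizes):
        current.append(i)
        current_size += size
        if current_size >= target_bytes:
            groups.append(current)
            current, current_size = [], 0
    if current:
        groups.append(current)
    return groups

# Example: skewed map-output sizes (bytes) from the previous stage.
sizes = [10, 20, 500, 5, 5, 300]
print(plan_coalesced_partitions(sizes, target_bytes=100))
# → [[0, 1, 2], [3, 4, 5]]
```

The point is that the statistics only exist once the previous stage has
finished, which is why this kind of decision naturally lives at a stage
boundary rather than "on demand" mid-stage.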

However, I think my original question still stands:
- How to /dynamically/ deal with poorly partitioned data without incurring a
shuffle or extra computation.
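For context on why this is hard: the shuffle-free option, coalesce, only
reassigns whole existing partitions to fewer tasks; it never redistributes
individual records, so skew inside a partition survives. A tiny sketch (plain
Python, illustrative only, not Spark's implementation):

```python
def coalesce_local(partitions, num_partitions):
    """Simulate a shuffle-free coalesce: each whole input partition is
    assigned to one output partition (round-robin here for simplicity).
    Records are never moved individually, so an oversized input
    partition stays oversized. Illustrative only."""
    out = [[] for _ in range(num_partitions)]
    for i, part in enumerate(partitions):
        out[i % num_partitions].extend(part)
    return out

# A skewed input: partition 2 is much larger than the others.
parts = [[1, 2], [3], [4, 5, 6], [7]]
print(coalesce_local(parts, 2))
# → [[1, 2, 4, 5, 6], [3, 7]]
```

Fixing the skew itself would require splitting records out of the large
partition, which is exactly the record-level redistribution that a shuffle
performs, hence the apparent trade-off in the question above.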

I think the only thing that's changed is that I no longer have any good
ideas on how to do it :/



--
Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
