EnricoMi opened a new pull request, #51201:
URL: https://github.com/apache/spark/pull/51201

   ### What changes were proposed in this pull request?
   Add an option to disable shuffle data migration to other executors, so that 
shuffle data is migrated only to the fallback storage.
   
   ### Why are the changes needed?
   Currently, even when fallback storage is enabled, shuffle data is migrated 
to other executors first, and to the fallback storage only when no other 
executor is available for migration. As a result, the same shuffle data can be 
migrated multiple times. There should be a mode of operation in which 
executors migrate their shuffle data to the fallback storage only, so that the 
data is migrated exactly once.
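
For context, the current behavior described above applies to a setup along the following lines. This is a minimal sketch that uses only decommission settings already present in released Spark versions; the object name and the `s3a://spark-storage/` path are placeholders, and the option added by this pull request is not shown because its name is not stated in this description.

```scala
import org.apache.spark.SparkConf

// Hypothetical wrapper object, used only to make the snippet compile on its own.
object DecommissionConfSketch {
  val conf: SparkConf = new SparkConf()
    // Enable executor decommissioning.
    .set("spark.decommission.enabled", "true")
    // Enable block migration from decommissioning executors.
    .set("spark.storage.decommission.enabled", "true")
    // Migrate shuffle blocks as part of decommissioning.
    .set("spark.storage.decommission.shuffleBlocks.enabled", "true")
    // Fallback storage location (placeholder path). Spark does not clean it up.
    .set("spark.storage.decommission.fallbackStorage.path", "s3a://spark-storage/")
}
```

With these settings alone, shuffle data is still offered to peer executors first and reaches the fallback path only when no peer is available.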
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Unit test and manual test via [reproduction 
example](https://gist.github.com/EnricoMi/e9daa1176bce4c1211af3f3c5848112a/3140527bcbedec51ed2c571885db774c880cb941).
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

