EnricoMi opened a new pull request, #51201: URL: https://github.com/apache/spark/pull/51201
### What changes were proposed in this pull request?
Allow disabling shuffle data migration to other executors, so that shuffle data is migrated only to the fallback storage.

### Why are the changes needed?
Currently, even when fallback storage is enabled, shuffle data is migrated to other executors first. This causes shuffle data to be migrated multiple times. Shuffle data is migrated to the fallback storage only when no other executor is available for migration. There should be a mode of operation in which executors migrate their shuffle data to the fallback storage only, so that the data is migrated exactly once.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Unit test and manual test via [reproduction example](https://gist.github.com/EnricoMi/e9daa1176bce4c1211af3f3c5848112a/3140527bcbedec51ed2c571885db774c880cb941).

### Was this patch authored or co-authored using generative AI tooling?
No
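
For context, here is a minimal sketch of the existing Spark settings that enable decommissioning with a fallback storage, rendered as `spark-submit` arguments. The bucket path and the helper `to_submit_args` are hypothetical and only for illustration. The option added by this pull request is not named in the description above, so it is deliberately left out.

```python
# Existing Spark settings for decommissioning with fallback storage.
# The storage path is a placeholder (assumption); the new option from this
# pull request is not named in the description and is therefore omitted.
decommission_conf = {
    "spark.decommission.enabled": "true",
    "spark.storage.decommission.enabled": "true",
    "spark.storage.decommission.shuffleBlocks.enabled": "true",
    "spark.storage.decommission.fallbackStorage.path": "s3a://example-bucket/spark-fallback/",
}


def to_submit_args(conf):
    """Render a config mapping as a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Print the arguments on one line, ready to paste after `spark-submit`.
    print(" ".join(to_submit_args(decommission_conf)))
```

With only these settings, shuffle blocks are still offered to peer executors first, which is the behavior this pull request makes optional.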
