advancedxy opened a new issue, #825: URL: https://github.com/apache/incubator-uniffle/issues/825
### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

### Search before asking

- [X] I have searched in the [issues](https://github.com/apache/incubator-uniffle/issues) and found no similar issues.

### Describe the proposal

Currently, the shuffle server assignment info (for Spark) is created when the shuffle is registered on the driver side. `RssShuffleHandleInfo` holds the assignment info and is immutable, so the shuffle server assignment is static. This simplifies the reader and writer implementations at the cost of flexibility.

When developing #477, one piece was left to fully enable stage re-submission: regenerating the shuffle server assignments when the whole stage is re-submitted.

In this proposal, I'd like to support dynamic shuffle server assignment during stage resubmission or when bad shuffle servers are detected on the fly. To achieve that, several parts need to be updated:

1. Add new interfaces to `ShuffleManagerGrpcService` to generate shuffle server assignments dynamically.
2. The corresponding `ShuffleManagerClient`.
3. The ShuffleRead/Write clients and `RssShuffleManager`, which should eliminate the use of `RssShuffleHandleInfo`.
4. Rethink/rework the reader's data check.
5. Maybe a unified, static shuffle client to share shuffle assignments; otherwise it would be costly to retrieve shuffle assignments for each task.

### Task list

TBD

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
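To make item 5 of the proposal more concrete, here is a minimal sketch of a process-wide assignment cache that tasks on one executor could share, so the driver is only contacted when a newer stage attempt shows up. Everything in it is an assumption made for illustration: `ShuffleAssignmentCache`, `Assignment`, the `fetcher` callback (a stand-in for whatever RPC `ShuffleManagerClient` would eventually expose), and the string server ids are hypothetical names, not existing Uniffle APIs.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiFunction;

/** Hypothetical sketch: one shared cache of partition-to-server assignments per shuffle. */
final class ShuffleAssignmentCache {

  /** Immutable snapshot of the assignment for one stage attempt (hypothetical type). */
  static final class Assignment {
    final int stageAttempt;
    final Map<Integer, List<String>> partitionToServers;

    Assignment(int stageAttempt, Map<Integer, List<String>> partitionToServers) {
      this.stageAttempt = stageAttempt;
      this.partitionToServers = Map.copyOf(partitionToServers);
    }
  }

  // Stand-in for an RPC to the driver: (shuffleId, stageAttempt) -> partition-to-server map.
  private final BiFunction<Integer, Integer, Map<Integer, List<String>>> fetcher;
  private final ConcurrentHashMap<Integer, Assignment> byShuffleId = new ConcurrentHashMap<>();

  ShuffleAssignmentCache(BiFunction<Integer, Integer, Map<Integer, List<String>>> fetcher) {
    this.fetcher = fetcher;
  }

  /**
   * Returns the cached assignment and refetches only when the caller belongs to a newer
   * stage attempt than the cached one. The fetch runs inside compute(), so concurrent
   * callers for the same shuffle wait for a single fetch instead of issuing their own.
   */
  Assignment get(int shuffleId, int stageAttempt) {
    return byShuffleId.compute(shuffleId, (id, cached) ->
        cached != null && cached.stageAttempt >= stageAttempt
            ? cached
            : new Assignment(stageAttempt, fetcher.apply(id, stageAttempt)));
  }

  /** Drops the cached entry, for example after a task reports a bad shuffle server. */
  void invalidate(int shuffleId) {
    byShuffleId.remove(shuffleId);
  }

  public static void main(String[] args) {
    AtomicInteger driverCalls = new AtomicInteger();
    ShuffleAssignmentCache cache = new ShuffleAssignmentCache((shuffleId, attempt) -> {
      driverCalls.incrementAndGet(); // counts simulated round trips to the driver
      return Map.of(0, List.of("server-a-" + attempt), 1, List.of("server-b-" + attempt));
    });

    cache.get(7, 0); // first task of attempt 0: fetches
    cache.get(7, 0); // second task of attempt 0: served from the cache
    cache.get(7, 1); // stage re-submitted as attempt 1: fetches again
    System.out.println("driver calls: " + driverCalls.get()); // prints "driver calls: 2"
  }
}
```

Keying the cache by shuffle id and comparing stage attempts is only one possible invalidation policy; a real design would also have to decide how readers learn which servers hold data written under an earlier assignment, which is what item 4 of the proposal points at.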
