zjf2012 opened a new issue, #615: URL: https://github.com/apache/incubator-uniffle/issues/615
### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

### Search before asking

- [X] I have searched in the [issues](https://github.com/apache/incubator-uniffle/issues?q=is%3Aissue) and found no similar issues.

### What would you like to be improved?

Both map and reduce tasks reference an RssShuffleHandle that wraps 'partitionToServers', which is usually far larger than the original task binary. For example, for a shuffle with 10,000 partitions, 'partitionToServers' could easily reach 250,000 bytes, assuming each map entry takes 25 bytes. A large task binary leads to long task delay and long task serialization time.

### How should we improve?

We can replace 'partitionToServers' with something more compact, such as a mapping function that maps a partition ID to its shuffle servers. Each executor would fetch the shuffle servers only once, in its first shuffle task, and cache them for later shuffle tasks with the same shuffle ID.

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!
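
To make the proposal under "How should we improve?" more concrete, here is a minimal sketch of a partition-to-server mapping function combined with a per-executor cache keyed by shuffle ID. Every name in it (`PartitionServerMapping`, `serversForShuffle`, `serversForPartition`, the `fetcher` callback, the `server-a` style identifiers) is hypothetical and not part of Uniffle. The round-robin rule is also an assumption: a mapping function can only replace 'partitionToServers' if the actual assignment follows some deterministic rule that tasks can recompute.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch; none of these names come from the Uniffle code base. */
final class PartitionServerMapping {

    // Per-executor (per-JVM) cache: shuffle ID -> ordered list of shuffle servers.
    private static final Map<Integer, List<String>> SERVERS_BY_SHUFFLE =
            new ConcurrentHashMap<>();

    private PartitionServerMapping() {}

    /**
     * Returns the server list for a shuffle. The fetcher (an assumed callback,
     * e.g. a request to the driver) runs only the first time this executor
     * sees the shuffle ID; later tasks read the cached list.
     */
    static List<String> serversForShuffle(
            int shuffleId, Function<Integer, List<String>> fetcher) {
        return SERVERS_BY_SHUFFLE.computeIfAbsent(
                shuffleId,
                id -> Collections.unmodifiableList(new ArrayList<>(fetcher.apply(id))));
    }

    /**
     * Maps a partition ID to `replicas` servers using an assumed round-robin
     * rule, so tasks need only the short server list, not a per-partition map.
     */
    static List<String> serversForPartition(
            int partitionId, List<String> servers, int replicas) {
        int n = servers.size();
        if (partitionId < 0 || replicas < 1 || replicas > n) {
            throw new IllegalArgumentException("invalid partition ID or replica count");
        }
        List<String> assigned = new ArrayList<>(replicas);
        for (int r = 0; r < replicas; r++) {
            assigned.add(servers.get((partitionId % n + r) % n));
        }
        return assigned;
    }

    public static void main(String[] args) {
        List<String> servers = serversForShuffle(
                0, id -> Arrays.asList("server-a", "server-b", "server-c"));
        // Partition 4 with 2 replicas over 3 servers.
        System.out.println(serversForPartition(4, servers, 2)); // prints [server-b, server-c]
    }
}
```

With this shape, the data shipped with each task shrinks from one entry per partition to the shuffle ID plus whatever parameters the mapping rule needs, and the server list itself is fetched once per executor per shuffle.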
