xianjingfeng opened a new issue, #339: URL: https://github.com/apache/incubator-uniffle/issues/339
### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

### Search before asking

- [X] I have searched in the [issues](https://github.com/apache/incubator-uniffle/issues?q=is%3Aissue) and found no similar issues.

### What would you like to be improved?

Currently, `org.apache.uniffle.client.impl.grpc.ShuffleServerGrpcClient#sendShuffleData` keeps retrying against a single shuffle server for a long time and fails only once `rss.client.send.check.timeout.ms` is reached. The exception is as follows:

`Timeout: Task[2852_0] failed because 200 blocks can't be sent to shuffle server in 600000 ms.`

As a result, the client never tries to send the data to other servers.

### How should we improve?

1. Don't retry inside `requirePreAllocation`; retry only at the upper level.
2. Set the default value of `rss.client.send.check.timeout.ms` to a smaller value, such as 10.

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!
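
To illustrate the first suggestion (no retry inside `requirePreAllocation`, retry only at the upper level), here is a minimal sketch. It does not use the real Uniffle client classes: `ServerClient`, `tryRequirePreAllocation`, `FakeServer`, `sendWithFailover`, and `maxRounds` are all hypothetical names made up for this example. The assumption is that a pre-allocation attempt can report failure immediately, so the caller can move on to another server instead of waiting on a single one until the timeout.

```java
import java.util.List;
import java.util.Optional;

class FailFastSendSketch {

    /** Hypothetical stand-in for a shuffle server client; NOT the real Uniffle API. */
    interface ServerClient {
        String id();

        /** One pre-allocation attempt; returns false at once if no buffer is available. */
        boolean tryRequirePreAllocation(int requireSize);

        void send(byte[] data);
    }

    /** In-memory fake used only to exercise the sketch. */
    static final class FakeServer implements ServerClient {
        private final String id;
        private final boolean hasBuffer;
        int allocationAttempts = 0;
        int blocksReceived = 0;

        FakeServer(String id, boolean hasBuffer) {
            this.id = id;
            this.hasBuffer = hasBuffer;
        }

        @Override
        public String id() {
            return id;
        }

        @Override
        public boolean tryRequirePreAllocation(int requireSize) {
            allocationAttempts++;
            return hasBuffer;
        }

        @Override
        public void send(byte[] data) {
            blocksReceived++;
        }
    }

    /**
     * Upper-level retry: each round tries every candidate server once.
     * Returns the id of the server that accepted the data, or empty if
     * no server accepted it within maxRounds rounds.
     */
    static Optional<String> sendWithFailover(List<? extends ServerClient> servers,
                                             byte[] data,
                                             int maxRounds) {
        for (int round = 0; round < maxRounds; round++) {
            for (ServerClient server : servers) {
                // Fail fast on a busy server and fall through to the next one.
                if (server.tryRequirePreAllocation(data.length)) {
                    server.send(data);
                    return Optional.of(server.id());
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<FakeServer> servers = List.of(
                new FakeServer("busy", false),
                new FakeServer("idle", true));
        Optional<String> chosen = sendWithFailover(servers, new byte[16], 3);
        System.out.println(chosen.orElse("none"));
    }
}
```

With this shape, a busy server costs one failed attempt per round rather than the whole send timeout. A real implementation would probably also need a backoff between rounds and a bound on total elapsed time, both of which are left out here.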
