HHoflittlefish777 opened a new pull request, #35266: URL: https://github.com/apache/doris/pull/35266
## Proposed changes

When an exception occurs, such as `get offset failed` caused by network isolation or a BE node going down during an upgrade, a job can pause unexpectedly. Doris therefore introduced auto resume to keep jobs stable. However, auto resume retries quickly, often before the fault has been repaired, so the job pauses again and then can no longer auto resume, even though it could have resumed once the fault was fixed. This PR introduces a backoff algorithm into auto resume so that the resume scheduling rule works better and jobs are more stable.
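
To illustrate the idea of backing off between auto-resume attempts, here is a minimal sketch. It is not the code in this PR: the description does not say which backoff strategy is used, so exponential backoff with a cap is assumed, and the class name `AutoResumeBackoff`, its methods, and the default delays are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AutoResumeBackoff:
    """Hypothetical helper that decides when a paused job may be auto-resumed.

    Assumes capped exponential backoff; the PR description does not specify
    the actual strategy or its parameters.
    """

    base_delay_s: float = 5.0     # assumed delay after the first attempt
    max_delay_s: float = 300.0    # assumed upper bound on the delay
    attempts: int = 0             # consecutive auto-resume attempts so far
    next_allowed_at: float = 0.0  # earliest time the next attempt may run

    def delay_for(self, attempt: int) -> float:
        """Return the wait after attempt number `attempt` (0-based), capped."""
        exponent = min(attempt, 32)  # bound the exponent to avoid huge values
        return min(self.base_delay_s * (2 ** exponent), self.max_delay_s)

    def should_resume(self, now: float) -> bool:
        """True once the backoff window for the previous attempt has elapsed."""
        return now >= self.next_allowed_at

    def record_attempt(self, now: float) -> None:
        """Register an auto-resume attempt and schedule the next allowed one."""
        self.next_allowed_at = now + self.delay_for(self.attempts)
        self.attempts += 1

    def reset(self) -> None:
        """Clear the backoff state, e.g. after the job has run normally again."""
        self.attempts = 0
        self.next_allowed_at = 0.0
```

A scheduler loop would call `should_resume(now)` before each auto-resume, `record_attempt(now)` when it resumes the job, and `reset()` once the job has stayed healthy. With the assumed defaults, the waits grow as 5, 10, 20, 40 seconds and so on up to the 300-second cap, which gives a transient fault time to clear instead of exhausting the resume attempts immediately.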