HuangZhenQiu commented on PR #26640: URL: https://github.com/apache/flink/pull/26640#issuecomment-4376535601
Thanks @Zakelly for the detailed suggestions. Let me share some observations from our production environment.

1. Will reducing the thread count achieve similar functionality? Partially, yes. We have a job with 1024 TMs. With the thread count set to 1, we still see 1024 concurrent writes. With the jitter set to 10 seconds and the thread count unchanged, the number of concurrent writes drops to about 500 on average. In our limited experience, jitter still gives users the flexibility to reduce concurrency from the number of TMs to a lower number.

2. Would supporting flow control in CheckpointStreamFactory and CheckpointStateOutputStream be a better solution? I understand your point that a single place would then cover all the different state backends, and that it would be a flow-control solution for state upload. What is the benefit of using the JM to control it? In our production environment, tens of jobs write checkpoints to the same bucket. Even with JM coordination, we still cannot control traffic across jobs. I would be glad to hear more details of your idea.
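To make the jitter point concrete, here is a minimal simulation sketch, not the PR's implementation. It assumes each TM delays its upload by a uniform random jitter and that every upload takes the same fixed time. The function name `peak_concurrency`, the 5-second upload duration, and the uniform distribution are all hypothetical choices for illustration.

```python
import random


def peak_concurrency(num_writers, jitter_s, upload_s, seed=0):
    """Peak number of overlapping uploads when every writer delays its
    start by a uniform random jitter in [0, jitter_s).

    Assumptions (hypothetical): all uploads take exactly upload_s seconds
    (upload_s > 0) and all writers are triggered at time 0.
    """
    rng = random.Random(seed)
    events = []
    for _ in range(num_writers):
        start = rng.uniform(0.0, jitter_s) if jitter_s > 0 else 0.0
        events.append((start, 1))               # upload begins
        events.append((start + upload_s, -1))   # upload ends
    # Sort by time; at equal timestamps, ends (-1) are processed before starts (+1).
    events.sort(key=lambda e: (e[0], e[1]))
    active = 0
    peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)
    return peak


if __name__ == "__main__":
    # 1024 writers, as in the job described above; 5 s upload time is an assumption.
    print("no jitter:  ", peak_concurrency(1024, 0.0, 5.0))
    print("10 s jitter:", peak_concurrency(1024, 10.0, 5.0))
```

Under these assumptions, jitter lowers the peak only when the upload time is shorter than the jitter window. If every upload outlasts the window, all writers still overlap at some point.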
