HuangZhenQiu commented on PR #26640:
URL: https://github.com/apache/flink/pull/26640#issuecomment-4376535601

   Thanks @Zakelly for the detailed suggestions. Let me share some 
observations from our production environment.
   
   1. Will reducing the thread count achieve similar functionality? 
   Partially, yes. We have a job with 1024 TMs. With the thread count set to 
1, we still see 1024 concurrent writes. With the jitter configured to 10 
seconds and the thread count unchanged, we can reduce the concurrent writes to 
about 500 on average. From our limited experience, jitter still gives users the 
flexibility to further reduce the concurrency from the number of TMs to a lower 
number.
   
   2. Would supporting flow control in CheckpointStreamFactory and 
CheckpointStateOutputStream be a better solution?
   I understand your point that a single place could support all of the 
different state backends. It would be a flow control solution for the state 
upload. What is the benefit of using the JM to control it? In our production 
environment, we have tens of jobs writing to the same bucket for checkpoints. 
Even with JM coordination, we would still not be able to do cross-job traffic 
control. I would like to hear a more detailed idea from you. 
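
   To illustrate point 1, here is a minimal sketch of why jitter lowers the 
write concurrency even when the thread count stays the same. It is a toy model, 
not the code in this PR: the function name `peak_concurrent_writes`, the 
uniform delay distribution, and the 5-second upload duration are all 
assumptions chosen only for illustration. Only the 1024 TMs and the 10-second 
jitter come from our setup.

```python
import random


def peak_concurrent_writes(num_tms, upload_seconds, jitter_seconds, seed=0):
    """Toy model: every TM starts one upload after a random delay.

    Assumptions (hypothetical, not taken from the PR): the delay is drawn
    uniformly from [0, jitter_seconds] and every upload takes exactly
    upload_seconds.
    """
    rng = random.Random(seed)
    events = []
    for _ in range(num_tms):
        start = rng.uniform(0.0, jitter_seconds)
        events.append((start, 1))                    # upload begins
        events.append((start + upload_seconds, -1))  # upload ends
    # Tuples sort by time first; at equal times an end (-1) sorts before a
    # start (+1), so back-to-back uploads are not counted as overlapping.
    events.sort()

    active = 0
    peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)
    return peak


if __name__ == "__main__":
    # 1024 TMs as in our job; the 5-second upload duration is an assumption.
    print("no jitter :", peak_concurrent_writes(1024, 5.0, 0.0))
    print("10s jitter:", peak_concurrent_writes(1024, 5.0, 10.0))
```

   Without jitter, all 1024 uploads overlap. With a jitter window longer than 
the upload duration, only a fraction of the TMs are writing at any moment, so 
the peak drops well below the number of TMs. The exact figure depends on the 
real upload duration, which this sketch does not know.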
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
