timmylicheng commented on issue #668:
URL: https://github.com/apache/hadoop-ozone/pull/668#issuecomment-616920062


   @sodonnel Thanks for the benchmarking effort. That is a valid concern for
large clusters in the future. I created
https://issues.apache.org/jira/browse/HDDS-3466 to track this potential
bottleneck.
   Say we allow 5 pipelines per DN and each pipeline spans 3 DNs: 10K
pipelines would then require 10K*3/5 = 6K DNs.
   My team is testing how many DNs each OM and SCM can support. My takeaway is
that we will need to resolve quite a few gRPC and Ratis issues before we can
scale Ozone up to 6K DNs per SCM, so we may not see this bottleneck in
production clusters in the short term. However, we definitely want to track it
and do some testing so that it does not become a blocker.
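
   The DN estimate above can be sketched as a small calculation. This is only
an illustration of the arithmetic, not SCM code: the function names are
hypothetical, and it assumes every pipeline occupies 3 distinct DNs and that
the per-DN pipeline cap is the only constraint.

```python
def datanodes_needed(pipelines: int,
                     nodes_per_pipeline: int = 3,
                     pipelines_per_dn: int = 5) -> int:
    """Minimum DN count so that no DN hosts more than pipelines_per_dn pipelines.

    Hypothetical helper: assumes each pipeline occupies nodes_per_pipeline
    distinct DNs and ignores every other placement constraint.
    """
    if pipelines < 0 or nodes_per_pipeline <= 0 or pipelines_per_dn <= 0:
        raise ValueError("counts must be non-negative and limits positive")
    if pipelines == 0:
        return 0
    # Each pipeline consumes one "slot" on each of its member DNs.
    slots = pipelines * nodes_per_pipeline
    # Integer ceiling division: DNs needed to provide that many slots.
    by_capacity = -(-slots // pipelines_per_dn)
    # A single pipeline still needs nodes_per_pipeline distinct DNs.
    return max(by_capacity, nodes_per_pipeline)


def max_pipelines(datanodes: int,
                  nodes_per_pipeline: int = 3,
                  pipelines_per_dn: int = 5) -> int:
    """Inverse estimate: the most pipelines a given DN count can host."""
    if datanodes < 0 or nodes_per_pipeline <= 0 or pipelines_per_dn <= 0:
        raise ValueError("counts must be non-negative and limits positive")
    return datanodes * pipelines_per_dn // nodes_per_pipeline


print(datanodes_needed(10_000))  # 10K * 3 / 5 = 6000
print(max_pipelines(200))        # 200 * 5 // 3 = 333
```

   Under the same assumptions, a 200-DN cluster would top out at roughly 333
pipelines, far below the 10K range discussed here.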
   
   Once this does become a bottleneck, the major impact is that the
createPipeline process would take a long time, which could cause a slow cold
start for SCM and DNs. In addition, 10K pipelines would generate many pipeline
reports and container reports for a single SCM. We may ultimately want to limit
the overall pipeline count per SCM.
   
   @fapifta Yes, I agree with you. Let's use HDDS-3466 to track this. We will
learn more about SCM's capacity and the best setup as we gradually move up to
larger clusters. My team is currently testing 200 DNs with a single OM and SCM
(in a k8s environment). Hopefully we can push further and get a clearer
picture.


