ChenSammi commented on issue #1469: HDDS-2034. Async RATIS pipeline creation and destroy through heartbea…
URL: https://github.com/apache/hadoop/pull/1469#issuecomment-536527360

> > > @ChenSammi Thanks a lot for working on this! Please find my comments below.
> > 1. For the SafeModeRules, if we allow pipeline creation during safe mode we need to modify the rules so that newly created pipelines are not counted in the rule.
> > 2. Can we just trigger PipelineReport from the datanodes after creation of pipeline instead of CreatePipelineACK? That would greatly simplify the OPEN pipeline code.

Thanks @lokeshj1703 for reviewing the patch.

> > > > For SafeModeRules, I did modify the HealthyPipelineSafeModeRule a bit. I added two properties. One is "hdds.scm.safemode.pipeline.creation", which controls whether pipelines are created in safe mode.
> > The other is "hdds.scm.safemode.min.pipeline", which controls the minimum number of pipelines required to exit safe mode when pipeline creation in safe mode is enabled.
> > @ChenSammi The problem is safe mode rules are only tracking the old pipelines. But since they are listening to OPEN_PIPELINE event any newly created pipeline is counted in the rule. So if we are waiting for 50 old pipelines and 20 new ones are created, rule would pass if just 30 old pipelines are reported. Therefore I think we need a way to separate the old pipelines from new ones in the rules.

Hi @lokeshj1703, I understand your point. Let me explain my thinking from a different point of view. I think the purpose of safe mode is to guarantee that the Ozone cluster is ready to serve Ozone clients once safe mode is exited. As long as there are enough open pipelines to serve read/write requests, Ozone can exit safe mode. If we want 50 open pipelines in a cluster before exiting safe mode, we may not care very much whether they are new pipelines or old pipelines.
Datanodes go up and down while SCM is starting. What if some old pipelines are dead and lost forever? New pipelines can replace these dead pipelines. Currently each datanode can join only one factor THREE RATIS pipeline, so very few new pipelines will be created after an SCM restart. When the multi-raft feature is enabled, there is also an upper limit on how many pipelines each datanode can join. So basically, if no new datanodes join after an SCM restart, the majority will be old pipelines, with only a few new pipelines, if any.
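
To make the counting question in this thread concrete, here is a minimal sketch in plain Java. It does not use the real SCM classes: `CountAllRule`, `CountOldOnlyRule`, and `simulate` are hypothetical names introduced only for illustration, and the threshold is assumed to be a simple absolute count rather than whatever HealthyPipelineSafeModeRule actually computes. It replays the 50 old / 20 new / 30 reported scenario against both counting strategies.

```java
import java.util.HashSet;
import java.util.Set;

public class SafeModePipelineCountSketch {

    /** Hypothetical rule: every OPEN_PIPELINE event counts toward the threshold. */
    static final class CountAllRule {
        private final int threshold;
        private int opened;

        CountAllRule(int threshold) {
            this.threshold = threshold;
        }

        void onOpenPipeline(String pipelineId) {
            opened++; // old and new pipelines are not distinguished
        }

        boolean satisfied() {
            return opened >= threshold;
        }
    }

    /** Hypothetical rule: only pipelines known before the restart count. */
    static final class CountOldOnlyRule {
        private final Set<String> oldPipelines;
        private final Set<String> reported = new HashSet<>();
        private final int threshold;

        CountOldOnlyRule(Set<String> oldPipelines, int threshold) {
            this.oldPipelines = oldPipelines;
            this.threshold = threshold;
        }

        void onOpenPipeline(String pipelineId) {
            if (oldPipelines.contains(pipelineId)) {
                reported.add(pipelineId); // newly created pipelines are ignored
            }
        }

        boolean satisfied() {
            return reported.size() >= threshold;
        }
    }

    /** Returns {countAllSatisfied, oldOnlySatisfied} for the given scenario. */
    public static boolean[] simulate(int oldTotal, int newOpened, int oldReported) {
        Set<String> old = new HashSet<>();
        for (int i = 0; i < oldTotal; i++) {
            old.add("old-" + i);
        }
        // Assumption: both rules wait for as many pipelines as existed before restart.
        CountAllRule countAll = new CountAllRule(oldTotal);
        CountOldOnlyRule oldOnly = new CountOldOnlyRule(old, oldTotal);

        for (int i = 0; i < newOpened; i++) {
            countAll.onOpenPipeline("new-" + i);
            oldOnly.onOpenPipeline("new-" + i);
        }
        for (int i = 0; i < oldReported; i++) {
            countAll.onOpenPipeline("old-" + i);
            oldOnly.onOpenPipeline("old-" + i);
        }
        return new boolean[] {countAll.satisfied(), oldOnly.satisfied()};
    }

    public static void main(String[] args) {
        boolean[] result = simulate(50, 20, 30);
        System.out.println("count-all rule satisfied: " + result[0]);
        System.out.println("old-only rule satisfied:  " + result[1]);
    }
}
```

With 20 new and 30 old pipelines reported, the count-all rule is satisfied while the old-only rule is not. Which behavior is preferable depends on whether safe mode is meant to confirm that the old pipelines are back or only that enough open pipelines exist to serve requests.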