[ 
https://issues.apache.org/jira/browse/HDDS-2034?focusedWorklogId=320467&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320467
 ]

ASF GitHub Bot logged work on HDDS-2034:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Sep/19 11:52
            Start Date: 30/Sep/19 11:52
    Worklog Time Spent: 10m 
      Work Description: ChenSammi commented on issue #1469: HDDS-2034. Async 
RATIS pipeline creation and destroy through heartbeat commands
URL: https://github.com/apache/hadoop/pull/1469#issuecomment-536527360
 
 
   > 
   > 
   > @ChenSammi Thanks a lot for working on this! Please find my comments below.
   > 
   >     1. For the SafeModeRules, if we allow pipeline creation during safe 
mode we need to modify the rules so that newly created pipelines are not 
counted in the rule.
   > 
   >     2. Can we just trigger PipelineReport from the datanodes after 
creation of pipeline instead of CreatePipelineACK? That would greatly simplify 
the OPEN pipeline code.
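   
   Suggestion 2 essentially reuses the existing report path: after 
bootstrapping the Ratis group, the datanode fires a pipeline report that SCM 
already knows how to handle, instead of a dedicated ACK. A minimal sketch of 
that flow; all class and method names here are illustrative stand-ins, not 
actual HDDS datanode code:
   
```java
import java.util.List;

/**
 * Sketch of the reviewer's suggestion: after creating the Ratis group,
 * the datanode triggers a PipelineReport rather than a CreatePipelineACK,
 * so SCM's normal report handling marks the pipeline OPEN.
 * All names are illustrative stand-ins, not the actual HDDS code.
 */
class CreatePipelineHandler {

  void onCreatePipelineCommand(String pipelineId, List<String> peers) {
    createRatisGroup(pipelineId, peers);
    // No separate ACK: the ordinary report path tells SCM the pipeline is up.
    triggerPipelineReport(pipelineId);
  }

  private void createRatisGroup(String pipelineId, List<String> peers) {
    // Stand-in for the local Ratis group bootstrap.
    System.out.println("bootstrapping Ratis group " + pipelineId
        + " with peers " + peers);
  }

  private void triggerPipelineReport(String pipelineId) {
    // Stand-in for publishing a pipeline report to SCM.
    System.out.println("reporting pipeline " + pipelineId + " to SCM");
  }
}
```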
   
   Thanks @lokeshj1703 for reviewing the patch.  
   1. For SafeModeRules, I did modify the HealthyPipelineSafeModeRule a little 
bit, adding two properties (quoted in full below). 
   
   > 
   > 
   > > For SafeModeRules, I did modify the HealthyPipelineSafeModeRule a bit, 
adding two properties: "hdds.scm.safemode.pipeline.creation" controls whether 
pipelines are created in safemode.
   > > "hdds.scm.safemode.min.pipeline" controls the minimum number of 
pipelines required to exit safemode when pipeline creation in safemode is 
enabled.
   > 
   > @ChenSammi The problem is that the safe mode rules are only tracking the 
old pipelines. But since they listen to the OPEN_PIPELINE event, any newly 
created pipeline is counted in the rule. So if we are waiting for 50 old 
pipelines and 20 new ones are created, the rule would pass if just 30 old 
pipelines are reported. Therefore I think we need a way to separate the old 
pipelines from the new ones in the rules.
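   
   One way to realize this separation, sketched below: snapshot the pipeline 
IDs that exist when SCM starts and count OPEN_PIPELINE events only for those 
IDs, so pipelines created during safemode never satisfy the rule. The class 
here is an illustrative stand-in, not the actual HealthyPipelineSafeModeRule:
   
```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/**
 * Sketch: count only pipelines that existed before the SCM (re)start,
 * so pipelines created while in safemode cannot satisfy the rule.
 */
class OldPipelineSafeModeRule {
  private final Set<String> preExistingPipelines;  // snapshot at SCM start
  private final Set<String> reportedOldPipelines = new HashSet<>();
  private final int threshold;                     // e.g. 50 old pipelines

  OldPipelineSafeModeRule(List<String> pipelinesAtStart, int threshold) {
    this.preExistingPipelines = new HashSet<>(pipelinesAtStart);
    this.threshold = threshold;
  }

  /** Called for every OPEN_PIPELINE event. */
  void onOpenPipeline(String pipelineId) {
    // Newly created pipelines are absent from the snapshot and are ignored.
    if (preExistingPipelines.contains(pipelineId)) {
      reportedOldPipelines.add(pipelineId);
    }
  }

  boolean validate() {
    return reportedOldPipelines.size() >= threshold;
  }
}
```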
   
   Hi @lokeshj1703, I understand your point. Let me explain my thinking from a 
different point of view. The purpose of safemode is to guarantee that the 
Ozone cluster is ready to serve Ozone clients once safemode is exited. As long 
as there are enough open pipelines to serve read/write requests, Ozone can 
exit safemode. If we require 50 open pipelines in a cluster to exit safemode, 
we may not care much whether they are new or old pipelines. Datanodes go up 
and down while SCM starts; some old pipelines may be dead and lost forever. 
New pipelines can replace these dead pipelines.  
   Currently each datanode can only join one factor-THREE RATIS pipeline, so 
very few new pipelines will be created after an SCM restart.   
   When the multi-raft feature is enabled, there is still an upper limit on 
how many pipelines each datanode can join. So if no new datanodes join after 
an SCM restart, the majority will be old pipelines, with only a few new ones, 
if any. 
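   
   To make the capacity argument concrete, a back-of-the-envelope sketch; the 
per-datanode pipeline limit is an assumed parameter, not a value from the 
patch:
   
```java
/** Sketch: upper bound on factor-THREE RATIS pipelines in a cluster. */
class PipelineCapacity {
  /**
   * Each factor-THREE pipeline occupies one pipeline slot on three
   * distinct datanodes, so the cluster-wide bound is
   * floor(datanodes * limitPerDatanode / 3).
   */
  static int maxThreeFactorPipelines(int datanodes, int limitPerDatanode) {
    return (datanodes * limitPerDatanode) / 3;
  }

  public static void main(String[] args) {
    // Without multi-raft: one pipeline per datanode, 30 nodes -> at most 10.
    System.out.println(maxThreeFactorPipelines(30, 1));  // 10
    // With multi-raft, assuming a limit of 3 per datanode -> at most 30.
    System.out.println(maxThreeFactorPipelines(30, 3));  // 30
  }
}
```
   
   Since the old pipelines already occupy almost all of this bound after a 
restart, only the slack (for example, slots freed by dead datanodes or added 
by new nodes) can turn into new pipelines.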
    
 


Issue Time Tracking
-------------------

    Worklog Id:     (was: 320467)
    Time Spent: 11h  (was: 10h 50m)

> Async RATIS pipeline creation and destroy through heartbeat commands
> --------------------------------------------------------------------
>
>                 Key: HDDS-2034
>                 URL: https://issues.apache.org/jira/browse/HDDS-2034
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>            Reporter: Sammi Chen
>            Assignee: Sammi Chen
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 11h
>  Remaining Estimate: 0h
>
> Currently, pipeline creation and destroy are synchronous operations. SCM 
> directly connects to each datanode of the pipeline through a gRPC channel 
> to create or destroy the pipeline.  
> This task is to remove the gRPC channel and send pipeline creation and 
> destroy actions through heartbeat commands to each datanode.
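
A minimal sketch of the asynchronous pattern described above: SCM enqueues a 
create or destroy command per datanode, and the next heartbeat response drains 
the queue. The queue API and the string command encoding are illustrative 
assumptions, not the actual HDDS classes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Sketch: deliver pipeline actions via heartbeat instead of direct gRPC. */
class HeartbeatCommandQueue {
  private final Map<String, Queue<String>> pending = new ConcurrentHashMap<>();

  /** SCM side: queue a command instead of opening a gRPC channel. */
  void addCommand(String datanodeId, String command) {
    pending.computeIfAbsent(datanodeId, k -> new ConcurrentLinkedQueue<>())
           .add(command);
  }

  /** Heartbeat handler: pending commands ride back on the response. */
  List<String> drainCommands(String datanodeId) {
    List<String> commands = new ArrayList<>();
    Queue<String> queue = pending.get(datanodeId);
    if (queue != null) {
      String command;
      while ((command = queue.poll()) != null) {
        commands.add(command);
      }
    }
    return commands;
  }
}
```

The datanode then executes each command locally and, as discussed in the 
review above, can trigger a PipelineReport so SCM learns when the pipeline is 
OPEN.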


