[jira] [Work logged] (HDDS-2034) Async RATIS pipeline creation and destroy through heartbeat commands

ASF GitHub Bot (Jira) Sat, 28 Sep 2019 20:11:16 -0700


     [ 
https://issues.apache.org/jira/browse/HDDS-2034?focusedWorklogId=320102&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320102
 ]


ASF GitHub Bot logged work on HDDS-2034:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 29/Sep/19 03:10
            Start Date: 29/Sep/19 03:10
    Worklog Time Spent: 10m 
      Work Description: ChenSammi commented on pull request #1469: HDDS-2034. 
Async RATIS pipeline creation and destroy through heartbea…
URL: https://github.com/apache/hadoop/pull/1469#discussion_r329335596
 
 

 ##########
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/block/BlockManagerImpl.java
 ##########
 @@ -188,6 +208,15 @@ public AllocatedBlock allocateBlock(final long size, 
ReplicationType type,
           // TODO: #CLUTIL Remove creation logic when all replication types and
           // factors are handled by pipeline creator
           pipeline = pipelineManager.createPipeline(type, factor);
+          // wait until pipeline is ready
+          long current = System.currentTimeMillis();
+          while (!pipeline.isOpen() && System.currentTimeMillis() <
+              (current + pipelineCreateWaitTimeout)) {
+            try {
+              Thread.sleep(1000);
+            } catch (InterruptedException e) {
+            }
+          }
 
 Review comment:
   Creating a pipeline in the block allocation path is debatable.  The 
existing comment says "TODO: #CLUTIL Remove creation logic when all 
replication types and factors are handled by pipeline creator". 
   In detail, the "ALLOCATED" state will be handled in task HDDS-2177, "Add a 
scrubber thread to detect creation failure pipelines in ALLOCATED state".  
Currently, pipelineCreateWaitTimeout is calculated from 
"hdds.command.status.report.interval" and "hdds.heartbeat.interval", on the 
assumption that the connection between the Datanode and SCM is healthy.  What 
if the pipeline is created successfully, but the connection to SCM breaks and 
is restored only after a while?  Should we wait a little longer before 
deciding whether pipeline creation succeeded or failed?  So in HDDS-2177, I 
plan to add a configurable property for the pipeline creation timeout.  Every 
ALLOCATED pipeline that exceeds the creation timeout will be declared failed 
and garbage collected. 
   Whether we use CompletableFuture or a while loop, we need a timeout either 
way.  This is on the block allocation path: how much latency can a synchronous 
API tolerate?  Maybe the best approach is not to create a pipeline in this 
case at all, if we can make sure there are enough pipelines to use after 
exiting safe mode.
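
   For illustration only, here is a minimal sketch of a bounded wait of the 
kind discussed in this comment.  It is not code from the pull request: the 
class name BoundedWait, the method waitUntil, and its parameters are 
hypothetical, and the condition passed in stands in for a check such as 
pipeline.isOpen().  Unlike the loop in the diff, it uses a monotonic clock for 
the deadline and restores the interrupt flag instead of swallowing 
InterruptedException.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of the HDDS code base.
final class BoundedWait {

  private BoundedWait() {
  }

  /**
   * Polls the condition until it becomes true or the timeout elapses.
   *
   * @return true if the condition was observed true, false on timeout
   *         or if the waiting thread was interrupted first
   */
  static boolean waitUntil(BooleanSupplier condition, long timeoutMillis,
      long pollMillis) {
    // nanoTime is monotonic, so wall-clock adjustments cannot stretch
    // or shrink the wait.
    final long deadline =
        System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    while (!condition.getAsBoolean()) {
      long remainingNanos = deadline - System.nanoTime();
      if (remainingNanos <= 0) {
        return false; // timed out
      }
      long remainingMillis =
          Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
      try {
        // Never sleep past the deadline.
        Thread.sleep(Math.min(pollMillis, remainingMillis));
      } catch (InterruptedException e) {
        // Preserve the interrupt for the caller and stop waiting.
        Thread.currentThread().interrupt();
        return condition.getAsBoolean();
      }
    }
    return true;
  }
}
```

   Under these assumptions, a caller on the allocation path could write 
something like BoundedWait.waitUntil(pipeline::isOpen, timeout, 1000) and 
treat a false result as "pipeline not ready", leaving the decision about 
failure to whatever cleans up ALLOCATED pipelines.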
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 320102)
    Time Spent: 8.5h  (was: 8h 20m)

> Async RATIS pipeline creation and destroy through heartbeat commands
> --------------------------------------------------------------------
>
>                 Key: HDDS-2034
>                 URL: https://issues.apache.org/jira/browse/HDDS-2034
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>            Reporter: Sammi Chen
>            Assignee: Sammi Chen
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> Currently, pipeline creation and destroy are synchronous operations. SCM 
> directly connects to each datanode of the pipeline through a gRPC channel to 
> create or destroy the pipeline.  
> This task is to remove the gRPC channel and send pipeline creation and 
> destroy actions through heartbeat commands to each datanode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
