[jira] [Commented] (HDFS-11830) Ozone: Datanode needs to re-register to SCM if SCM is restarted

Weiwei Yang (JIRA) Wed, 17 May 2017 15:07:15 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-11830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16014841#comment-16014841
 ]


Weiwei Yang commented on HDFS-11830:
------------------------------------

Hello [~msingh]

Thank you for helping with the review. I have addressed most of your comments 
in the v2 patch, except one:

bq. We should also raise an exception if the endpoint is in any other state 
apart from HEARTBEAT.

We cannot raise an exception here. In test mode, if we set a short heartbeat 
interval (1s, for example), the datanode might not have fully transitioned to 
the {{REGISTER}} state before it receives another response from SCM carrying a 
{{reregisterCommand}}. I think simply ignoring the state change in this case 
should be fine. What do you think?
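
A minimal sketch of the "ignore" behavior described above, not the code from 
the patch. The class, enum, and method names ({{EndpointSketch}}, 
{{EndPointState}}, {{onReregisterCommand}}) are hypothetical and chosen only 
for illustration. The idea is that a re-register command moves the endpoint 
from HEARTBEAT to REGISTER, and a duplicate command arriving in any other 
state is a no-op instead of an error.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: these names are illustrative and are NOT taken from
// the HDFS-11830 patch.
public class EndpointSketch {
    enum EndPointState { GETVERSION, REGISTER, HEARTBEAT, SHUTDOWN }

    // Start in HEARTBEAT to model a datanode that is already registered.
    private final AtomicReference<EndPointState> state =
        new AtomicReference<>(EndPointState.HEARTBEAT);

    /**
     * Handles a re-register command from SCM.
     * Moves HEARTBEAT -> REGISTER; in any other state the command is
     * ignored rather than treated as an error.
     *
     * @return true if the state changed, false if the command was ignored
     */
    boolean onReregisterCommand() {
        return state.compareAndSet(EndPointState.HEARTBEAT,
                                   EndPointState.REGISTER);
    }

    EndPointState getState() {
        return state.get();
    }

    public static void main(String[] args) {
        EndpointSketch endpoint = new EndpointSketch();
        // First command: HEARTBEAT -> REGISTER.
        System.out.println(endpoint.onReregisterCommand());
        // Second command arrives before registration completes: ignored.
        System.out.println(endpoint.onReregisterCommand());
        System.out.println(endpoint.getState());
    }
}
```

With a short heartbeat interval, the second call models the duplicate response 
from SCM: the state stays at REGISTER and nothing is thrown.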

Thank you.

> Ozone: Datanode needs to re-register to SCM if SCM is restarted
> ---------------------------------------------------------------
>
>                 Key: HDFS-11830
>                 URL: https://issues.apache.org/jira/browse/HDFS-11830
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>            Priority: Critical
>         Attachments: HDFS-11830-HDFS-7240.001.patch, 
> HDFS-11830-HDFS-7240.002.patch
>
>
> Problem description:
> # Start NN, DN, SCM
> # Restart SCM; the following warning appears in the SCM log:
> 17/05/02 00:47:08 WARN node.SCMNodeManager: SCM receive heartbeat from 
> unregistered datanode
> The datanode cannot re-establish communication with SCM afterwards. Propose 
> to fix this by adding a new command to heartbeat (HB) handling that tells 
> the datanode to re-register with SCM. Once the datanode receives this 
> command, it transitions back to the REGISTER state and proceeds.
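
The SCM side of the proposal quoted above can be sketched roughly as follows. 
This is an illustration under stated assumptions, not the actual 
{{SCMNodeManager}} code: the class {{ScmHeartbeatSketch}}, the {{Command}} 
enum, and the in-memory set of registered datanode IDs are all hypothetical. 
A restart is modeled as losing the set of registered nodes, after which a 
heartbeat from a previously known datanode is answered with a re-register 
command.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: names and data structures are illustrative only and
// do not reflect the real SCMNodeManager implementation.
public class ScmHeartbeatSketch {
    enum Command { NONE, REREGISTER }

    // In-memory registration state; assumed to be lost when SCM restarts.
    private final Set<String> registered = ConcurrentHashMap.newKeySet();

    void register(String datanodeId) {
        registered.add(datanodeId);
    }

    // Models an SCM restart by dropping all registration state.
    void restart() {
        registered.clear();
    }

    // A heartbeat from an unknown datanode is answered with REREGISTER
    // instead of only logging a warning.
    Command handleHeartbeat(String datanodeId) {
        return registered.contains(datanodeId)
            ? Command.NONE
            : Command.REREGISTER;
    }

    public static void main(String[] args) {
        ScmHeartbeatSketch scm = new ScmHeartbeatSketch();
        scm.register("dn-1");
        System.out.println(scm.handleHeartbeat("dn-1")); // registered node
        scm.restart();
        System.out.println(scm.handleHeartbeat("dn-1")); // unknown after restart
        scm.register("dn-1");
        System.out.println(scm.handleHeartbeat("dn-1")); // registered again
    }
}
```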



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
