[ 
https://issues.apache.org/jira/browse/HDFS-3192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246818#comment-13246818
 ] 

Todd Lipcon commented on HDFS-3192:
-----------------------------------

bq. So in step #6, irrespective of when ZKFC1 gets the notification, ZKFC1 has 
to restart NN1. Otherwise, we don't know as to how long NN1 will stay in limbo.

Can you explain why it has to restart, instead of just transitioning to 
standby? What do you mean by "in limbo" here?


bq. Also, NN1 could resign much earlier without having to go through uncontrolled 
abort via fencing

Before issuing an "uncontrolled abort", ZKFC2 will always try to do a 
"graceful fence" -- i.e. ask NN1 to resign via an RPC. See the 
{{tryGracefulFence}} function in the {{FailoverController}} class.

Having the other node ask the active NN to resign is better than having the 
active NN resign on its own, because it is the only way the other node can be 
sure that it's "in the clear" to start writing to the logs (a 
"self-resignation" might come too late). Since the other node always has to 
verify the resignation before it starts to write, nothing extra is gained 
by having the active NN resign on its own first. It's just redundant.
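
The ordering described above can be sketched roughly as follows. This is only an illustration, not the actual {{FailoverController}} code: the {{FenceSketch}} class, the {{OldActive}} interface, and all of their method names are hypothetical. It assumes that the graceful request and the forced fence each report success only when the other node has positively confirmed the result.

```java
/** Hypothetical sketch of "graceful fence first, forced fence second". */
final class FenceSketch {

    /** How the old active NN was taken out of service, if at all. */
    enum Outcome { GRACEFUL, FORCED, FAILED }

    /** Hypothetical handle on the old active NN, as seen by the other node's ZKFC. */
    interface OldActive {
        /** Ask the NN to go to standby; true only if the RPC returned successfully. */
        boolean requestTransitionToStandby();

        /** Run a forced fencing method; true only if it confirmed success. */
        boolean forceFence();
    }

    /** Try the graceful path first and fall back to forced fencing. */
    static Outcome fence(OldActive oldActive) {
        boolean resigned;
        try {
            resigned = oldActive.requestTransitionToStandby();
        } catch (RuntimeException e) {
            // A hung or unreachable NN is treated as "did not resign".
            resigned = false;
        }
        if (resigned) {
            return Outcome.GRACEFUL;
        }
        return oldActive.forceFence() ? Outcome.FORCED : Outcome.FAILED;
    }

    /** The new active may write to the logs only after a confirmed fence. */
    static boolean safeToWrite(Outcome outcome) {
        return outcome != Outcome.FAILED;
    }
}
```

In this sketch the decision to start writing depends only on what the other node has verified itself, so a resignation that the old active performed on its own does not change the outcome.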


                
> Active NN should exit when it has not received a getServiceStatus() rpc from 
> ZKFC for timeout secs
> --------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3192
>                 URL: https://issues.apache.org/jira/browse/HDFS-3192
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, name-node
>            Reporter: Hari Mankude
>            Assignee: Hari Mankude
>


