[ 
https://issues.apache.org/jira/browse/HDFS-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-2179:
------------------------------

    Attachment: hdfs-2179.txt

Here's a preliminary version of this. I've included the basic framework code, 
as well as two fencing implementations:
1) shell-command based fencing
2) ssh-based fencing that uses jsch to ssh into the target node and {{fuser}} 
to kill whatever process is holding onto the target port
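
To make the first option concrete, here is a minimal sketch of what shell-command based fencing could look like. It is not the code in hdfs-2179.txt: the class name, method names, and timeout handling are all assumptions made for illustration. The idea shown is simply "run a configured command and treat exit status 0 as a successful fence". The {{fuser}} helper only builds the kind of command line the ssh-based method might run on the target node, using the psmisc {{-k}} (kill) and {{-n tcp}} (TCP port namespace) options.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of shell-command based fencing; not the attached patch. */
final class ShellFenceSketch {

    /**
     * Runs the command via "sh -c" and returns true only if it exits with
     * status 0 within the timeout. Any failure to run the command counts as
     * "not fenced", so the caller never assumes success by accident.
     */
    static boolean tryFence(String command, long timeoutSeconds) {
        try {
            Process p = new ProcessBuilder("sh", "-c", command)
                    .redirectErrorStream(true)                        // merge stderr into stdout
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD)  // then drop the output
                    .start();
            if (!p.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                p.destroyForcibly();  // command hung: give up and report failure
                return false;
            }
            return p.exitValue() == 0;
        } catch (IOException e) {
            return false;             // command could not be started
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Builds a fuser command line that kills whatever holds the given TCP port. */
    static String fuserKillCommand(int port) {
        return "fuser -v -k -n tcp " + port;
    }
}
```

For example, {{ShellFenceSketch.tryFence("exit 0", 5)}} reports success, while a command that exits non-zero or cannot be started reports failure. Treating every error path as "not fenced" is the conservative choice here, since a false success is what would allow split brain.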

This isn't integrated into the NN at all yet, since it's not clear what 
the hook points will be. But if this looks like the right path, I'd like to 
commit it to the HA branch, and we can adapt it to its integration points (e.g., 
the failover controller) later.

> HA: namenode fencing mechanism
> ------------------------------
>
>                 Key: HDFS-2179
>                 URL: https://issues.apache.org/jira/browse/HDFS-2179
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-2179.txt
>
>
> In an HA cluster, when there are two NNs, the invariant that only one NN is 
> active at a time has to be preserved in order to prevent "split brain 
> syndrome." Thus, when a standby NN is transitioned to the "active" state during a 
> failover, it needs to somehow _fence_ the formerly active NN to ensure that 
> it can no longer perform edits. This JIRA is to discuss and implement NN 
> fencing.
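
The ordering described above (fence first, become active only afterwards) can be sketched as follows. This is an illustration under stated assumptions, not the design in the patch: the class and method names are hypothetical, fencing methods are modeled as plain {{BooleanSupplier}}s, and trying the methods in order until one succeeds is an assumed policy.

```java
import java.util.List;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of "fence before becoming active"; names are illustrative. */
final class FailoverSketch {

    /** Tries each fencing method in order; true as soon as one reports success. */
    static boolean fence(List<BooleanSupplier> methods) {
        for (BooleanSupplier method : methods) {
            if (method.getAsBoolean()) {
                return true;
            }
        }
        return false;  // no method succeeded (or none were configured)
    }

    /**
     * Returns the state the standby NN ends up in. It only becomes "active"
     * if the formerly active NN was fenced; otherwise it stays "standby",
     * which preserves the invariant that at most one NN is active.
     */
    static String failover(List<BooleanSupplier> methods) {
        return fence(methods) ? "active" : "standby";
    }
}
```

Note that an empty list of methods leaves the node in standby: with nothing able to confirm that the old active NN is fenced, the sketch refuses to promote.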

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
