[ 
https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245578#comment-13245578
 ] 

Hari Mankude commented on HDFS-3077:
------------------------------------

Todd,

The doc is excellent. I have a comment on a potential issue that the epoch 
number could cause in certain failure scenarios. Specifically, I am talking 
about the scenario in section 2.5.6.

J1 is at txid 153, J2 is at txid 150 and J3 is at txid 125. The epoch number on 
all the journals is 1. Both NN1 and NN2 are trying to become_active() at the 
same time. NN1 talks to J1 and J2 and sets the proposedEpoch to 2. NN2 talks to 
J2 and J3 and also decides to set the proposedEpoch to 2.

NN1 succeeds in setting newEpoch to 2 on J1 and fails on J2 and J3. NN1 dies 
since it does not have a quorum.
NN2 succeeds in setting newEpoch to 2 on J2 and J3 and has a quorum. NN2 
cannot talk to J1. Similar to the scenario in 2.5.6, NN2 writes txids 151, 152 
and 153 into J2 and J3 and then dies.

So the current state is that the epoch number is 2 on all the journals, and J1, 
J2 and J3 are all at txid 153. We have a problem, since it is not possible to 
distinguish the log entries in J1 from those in J2 and J3, even though NN2 
never wrote to J1.
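
To make the scenario concrete, here is a rough toy model of it in Python. It is 
only a sketch and is not taken from the design doc or the patch: the Journal 
class, new_epoch(), write() and the "old-writer"/"NN2" labels are all 
hypothetical names. It also assumes that J2 and J3 reject NN1 because NN2 
reached them first, and that recovery brings J3 up to txid 150 before NN2 
starts writing, neither of which is spelled out above.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    """Toy journal: a promised epoch plus a map of txid -> who wrote it."""
    name: str
    epoch: int
    # The writer label stands in for the edit's content (illustration only).
    edits: dict = field(default_factory=dict)

    @property
    def last_txid(self):
        return max(self.edits)

    def new_epoch(self, proposed):
        # Accept only an epoch strictly greater than the one already promised.
        if proposed <= self.epoch:
            return False
        self.epoch = proposed
        return True

    def write(self, txid, origin):
        self.edits[txid] = origin


def make_journal(name, last_txid):
    # All journals start at epoch 1 with edits from some earlier writer.
    edits = {t: "old-writer" for t in range(1, last_txid + 1)}
    return Journal(name, epoch=1, edits=edits)


def visible_state(journal):
    # What a recovering NN could compare in this toy: epoch and last txid.
    return (journal.epoch, journal.last_txid)


def run_scenario():
    j1 = make_journal("J1", 153)
    j2 = make_journal("J2", 150)
    j3 = make_journal("J3", 125)

    # Assumption: NN2 reaches J2 and J3 first, so both accept epoch 2.
    nn2_acks = [j.new_epoch(2) for j in (j2, j3)]
    # NN1 proposes the same epoch; only J1 accepts, so NN1 has no quorum.
    nn1_acks = [j.new_epoch(2) for j in (j1, j2, j3)]

    # Assumption: recovery copies txids 126..150 to J3 before new writes.
    for txid in range(126, 151):
        j3.write(txid, "old-writer")
    # NN2 cannot reach J1; it writes 151..153 to J2 and J3, then dies.
    for txid in (151, 152, 153):
        for j in (j2, j3):
            j.write(txid, "NN2")

    return j1, j2, j3, nn1_acks, nn2_acks
```

Calling run_scenario() and comparing visible_state() for the three journals 
gives the same (epoch, last txid) pair everywhere, while the writer recorded 
for txid 151 on J1 differs from the one on J2 and J3. In this toy, that is the 
ambiguity described above.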

 


                
> Quorum-based protocol for reading and writing edit logs
> -------------------------------------------------------
>
>                 Key: HDFS-3077
>                 URL: https://issues.apache.org/jira/browse/HDFS-3077
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: ha, name-node
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-3077-partial.txt, qjournal-design.pdf
>
>
> Currently, one of the weak points of the HA design is that it relies on 
> shared storage such as an NFS filer for the shared edit log. One alternative 
> that has been proposed is to depend on BookKeeper, a ZooKeeper subproject 
> which provides a highly available replicated edit log on commodity hardware. 
> This JIRA is to implement another alternative, based on a quorum commit 
> protocol, integrated more tightly in HDFS and with the requirements driven 
> only by HDFS's needs rather than more generic use cases. More details to 
> follow.


        
