[ 
https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245588#comment-13245588
 ] 

Suresh Srinivas commented on HDFS-3077:
---------------------------------------

Thanks for posting the design. Now I understand your comment that there are a 
lot of things in common between this one and the approach in HDFS-3092. Here are 
some high level comments:
# Terminology - JournalDaemon or JournalNode. I prefer JournalDaemon because my 
plan was to run them in the same process space as the namenode. A JournalDaemon 
could also be a stand-alone process.
# I like the idea of quorum writes and maintaining the queue. The 3092 design 
currently uses a timeout to declare a JD slow and fail it. We were planning to 
punt on it until we had a first implementation.
# newEpoch() is called fence() in HDFS-3092. My preference is to use the name 
fence(). What I was calling version # is called epoch here. I think the name 
epoch sounds better. The key difference is that the version # is generated from 
a znode in HDFS-3092, so two namenodes cannot use the same epoch number. I think 
there is a bug with the approach you have described, stemming from the fact that 
two namenodes can use the same epoch and step 3 in 2.4 can be completed 
independently of the quorum. This is shown in Hari's example.
# I prefer to record the epoch in the startLogSegment filler record. The 
startLogSegment record was never part of the journal; we added it for 
structural reasons. So adding epoch info to it should not matter. The way I see 
it: journal belongs to a segment, and a segment has a single version # or epoch.
# In both proposals epoch or version # needs to be sent in all journal requests.
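
As a rough illustration of items 2, 4 and 5 above (not code from HDFS-3077 or 
HDFS-3092), the sketch below models a journal daemon that records one epoch per 
segment at startLogSegment time and checks the epoch on every request, plus a 
writer-side helper that counts a call as successful once a majority of daemons 
accept it. All class, method and variable names are hypothetical. The rule that 
fence() only accepts a strictly greater epoch is an assumption made for the 
example; it does not model how either design generates epochs, and it does not 
address the bug described in item 3.

```python
# Hypothetical sketch: names below are illustrative, not HDFS APIs.

class StaleEpochError(Exception):
    """A request carried an epoch that this daemon does not accept."""


class JournalDaemon:
    def __init__(self):
        self.promised_epoch = 0  # highest epoch accepted by fence()
        self.segments = {}       # first txid -> {"epoch": int, "records": list}

    def fence(self, epoch):
        # Assumption for this sketch: only a strictly greater epoch is accepted.
        if epoch <= self.promised_epoch:
            raise StaleEpochError(
                f"epoch {epoch} is not newer than {self.promised_epoch}")
        self.promised_epoch = epoch

    def _check(self, epoch):
        # Every request carries the writer's epoch (item 5).
        if epoch != self.promised_epoch:
            raise StaleEpochError(
                f"epoch {epoch} != promised {self.promised_epoch}")

    def start_log_segment(self, epoch, first_txid):
        self._check(epoch)
        # The segment records its single epoch once, at its start (item 4).
        self.segments[first_txid] = {"epoch": epoch, "records": []}

    def journal(self, epoch, first_txid, record):
        self._check(epoch)
        self.segments[first_txid]["records"].append(record)


def quorum_call(daemons, call):
    """Run call(jd) on every daemon; True if a strict majority accepted it."""
    acks = 0
    for jd in daemons:
        try:
            call(jd)
            acks += 1
        except StaleEpochError:
            pass
    return acks > len(daemons) // 2


jds = [JournalDaemon() for _ in range(3)]
writer_a_ok = quorum_call(jds, lambda jd: jd.fence(1))
quorum_call(jds, lambda jd: jd.start_log_segment(1, 1))
quorum_call(jds, lambda jd: jd.journal(1, 1, "op-1"))
# A second writer takes over with a newer epoch.
writer_b_ok = quorum_call(jds, lambda jd: jd.fence(2))
# The first writer's request still carries epoch 1 and is now rejected.
stale_write_ok = quorum_call(jds, lambda jd: jd.journal(1, 1, "op-2"))
```

The calls here run serially and in-process; timeouts, the per-daemon queue and 
recovery of unfinished segments are left out to keep the example short.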

We could certainly make a list of common work items and create jiras, so that 
many people can collaborate and wrap it up, like we did in HDFS-1623.

                
> Quorum-based protocol for reading and writing edit logs
> -------------------------------------------------------
>
>                 Key: HDFS-3077
>                 URL: https://issues.apache.org/jira/browse/HDFS-3077
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: ha, name-node
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-3077-partial.txt, qjournal-design.pdf
>
>
> Currently, one of the weak points of the HA design is that it relies on 
> shared storage such as an NFS filer for the shared edit log. One alternative 
> that has been proposed is to depend on BookKeeper, a ZooKeeper subproject 
> which provides a highly available replicated edit log on commodity hardware. 
> This JIRA is to implement another alternative, based on a quorum commit 
> protocol, integrated more tightly in HDFS and with the requirements driven 
> only by HDFS's needs rather than more generic use cases. More details to 
> follow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
