[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs

Todd Lipcon (JIRA) Tue, 26 Jul 2011 17:53:34 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071462#comment-13071462
 ]


Todd Lipcon commented on HDFS-1073:
-----------------------------------

I see your point now about the protocol naming. I'll change it to 
JournalProtocol.

bq. Document the architecture is important as it is the proof of correctness of 
the approach
If writing documents about code guaranteed the code were correct, our jobs 
would be a lot easier, wouldn't they? :) But yes, I'll clean up the doc after 
merging to make sure there's nothing inaccurate.

bq. I hope "longer" does not mean file length? 
In the case that the only logs available starting at a given transaction ID are 
named edits_inprogress_N, we read through them to determine the "valid 
length" -- i.e. the number of valid transactions. A transaction is valid if it 
has a valid checksum, a sequential transaction ID, etc. The log with the most 
valid transactions is chosen.

So, extra 0s or FFs on the end of a file won't affect the "valid length".
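
The selection rule above can be sketched roughly as follows. This is an 
illustration only, not the code in the patch: TxRecord, InProgressLogChooser, 
validLength() and chooseBest() are hypothetical names, and a record's checksum 
result is modeled as a plain boolean. Trailing padding (0s or FFs) is assumed 
to decode as a record that fails validation, so it never adds to the count.

```java
import java.util.List;

/** Hypothetical stand-in for one decoded edit-log record. */
final class TxRecord {
    final long txId;
    final boolean checksumOk;

    TxRecord(long txId, boolean checksumOk) {
        this.txId = txId;
        this.checksumOk = checksumOk;
    }
}

/** Hypothetical helper that picks among in-progress logs with the same start ID. */
final class InProgressLogChooser {
    /**
     * Counts the leading records that pass the checksum and carry
     * consecutive transaction IDs starting at firstTxId.
     */
    static int validLength(List<TxRecord> records, long firstTxId) {
        long expected = firstTxId;
        int valid = 0;
        for (TxRecord r : records) {
            if (!r.checksumOk || r.txId != expected) {
                break; // first invalid record ends the valid prefix
            }
            valid++;
            expected++;
        }
        return valid;
    }

    /** Returns the index of the log with the most valid transactions, or -1 if there are none. */
    static int chooseBest(List<List<TxRecord>> candidates, long firstTxId) {
        int best = -1;
        int bestLen = -1;
        for (int i = 0; i < candidates.size(); i++) {
            int len = validLength(candidates.get(i), firstTxId);
            if (len > bestLen) {
                bestLen = len;
                best = i;
            }
        }
        return best;
    }
}
```

Note that the comparison is on the count of valid transactions, not on the 
byte length of the file, which is why padding at the end has no effect.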

bq. Do you attempt to restore bad streams on rollEdits() as done by 
attemptRestoreRemovedStorage() in current implementation?
Yes -- each JournalManager creates a new OutputStream object when edits are 
rolled.
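
A minimal sketch of why rolling doubles as a restore attempt, assuming a 
journal that drops its stream on failure. SketchJournalManager and its methods 
are hypothetical names for illustration, and an in-memory buffer stands in for 
the real edits file; this is not the JournalManager interface from the patch.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical journal that opens a fresh stream on every roll. */
final class SketchJournalManager {
    private ByteArrayOutputStream current; // null while the journal is failed or not yet opened
    private int streamsOpened;

    /** Called on every roll: always creates a new stream, even after a failure. */
    void startLogSegment(long firstTxId) {
        current = new ByteArrayOutputStream();
        streamsOpened++;
    }

    /** Simulates an I/O failure that takes this journal out of service. */
    void markFailed() {
        current = null;
    }

    boolean isActive() {
        return current != null;
    }

    int getStreamsOpened() {
        return streamsOpened;
    }

    void write(byte[] data) {
        if (current == null) {
            throw new IllegalStateException("journal is not active");
        }
        current.write(data, 0, data.length);
    }
}
```

Because the new stream is created unconditionally, a journal that failed 
during the previous segment is back in service after the next roll without a 
separate restore path.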

bq. ...OP_JSPOOL_START...
Yep, this is entirely eliminated now.

{quote}
Is it possible in your implementation that
a) BN already processed transactions with higher id than segmentTxId
b) BN hasn't seen yet transaction preceding segmentTxId
According to Precondition this should not be possible. What guarantees that?
{quote}

Because all of the calls to the BN go through JournalManager, and all of the 
calls are synchronous, the ordering won't get interleaved. That is to say, when 
an edit log is rolled, the startLogSegment() RPC call must respond before the 
next transaction can be journaled. And, before calling startLogSegment(), the 
previous log segment is flushed, guaranteeing that all previous edits "made it".

The Precondition is there just in case there's a bug that we missed -- this way 
we'll get a BN crash rather than something worse like silent data loss.
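
The kind of check being discussed can be sketched as below. This is an 
assumption about the shape of the check, not the actual BN code: 
BackupNodeModel and lastAppliedTxId are hypothetical names, and a plain 
IllegalStateException stands in for whatever the real Precondition throws.

```java
/** Hypothetical model of the backup node's view of the edit stream. */
final class BackupNodeModel {
    private long lastAppliedTxId;

    BackupNodeModel(long lastAppliedTxId) {
        this.lastAppliedTxId = lastAppliedTxId;
    }

    /** Applies one transaction; IDs must arrive strictly in sequence. */
    void journal(long txId) {
        check(txId == lastAppliedTxId + 1,
            "out-of-order txid " + txId + " after " + lastAppliedTxId);
        lastAppliedTxId = txId;
    }

    /**
     * A new segment must start exactly one past the last applied transaction.
     * A smaller segmentTxId is case (a) from the question, a larger one is case (b).
     */
    void startLogSegment(long segmentTxId) {
        check(segmentTxId == lastAppliedTxId + 1,
            "segment starts at " + segmentTxId + " but last applied txid is " + lastAppliedTxId);
    }

    long getLastAppliedTxId() {
        return lastAppliedTxId;
    }

    private static void check(boolean condition, String message) {
        if (!condition) {
            throw new IllegalStateException(message); // fail fast instead of diverging silently
        }
    }
}
```

With synchronous calls issued in order, neither branch should ever fire; the 
check only turns an ordering bug into an immediate failure.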

> Simpler model for Namenode's fs Image and edit Logs 
> ----------------------------------------------------
>
>                 Key: HDFS-1073
>                 URL: https://issues.apache.org/jira/browse/HDFS-1073
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 0.23.0
>            Reporter: Sanjay Radia
>            Assignee: Todd Lipcon
>             Fix For: 0.23.0
>
>         Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073-merge.patch, 
> hdfs-1073-merge.patch, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, 
> hdfs1073.pdf, hdfs1073.tex
>
>
> The naming and handling of NN's fsImage and edit logs can be significantly 
> improved, resulting in simpler and more robust code.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
