[ 
https://issues.apache.org/jira/browse/HDFS-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-2074:
------------------------------

    Attachment: hdfs-2074.txt

The attached patch does the following:
- refactors the reading of the FSEditLog header into a new function/class, so 
that log verification can share more code with log application
- similarly pushes the construction of the checksumming stream into the Reader 
class
- moves the old {{getValidLength}} implementation into the test code, since the 
non-test code paths no longer use it
- changes FSImageTransactionalStorageInspector to decide which log to recover 
based on how many valid txns each file contains
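
As a rough illustration of the last point, here is a minimal sketch of picking 
a log by valid-transaction count. It is not the code in the patch: 
{{LogCandidate}}, {{chooseLogToRecover}}, and the file paths are hypothetical 
names, and the per-file counts are assumed to have already been obtained by 
reading and validating each file.

```java
import java.util.Arrays;
import java.util.List;

class LogRecoveryChoiceSketch {

    // Hypothetical summary of one candidate edit log file; not an HDFS class.
    static final class LogCandidate {
        final String path;
        final int validTxns; // transactions that were read and validated

        LogCandidate(String path, int validTxns) {
            this.path = path;
            this.validTxns = validTxns;
        }
    }

    // Returns the candidate holding the most valid transactions, or null if
    // the list is empty. On a tie the earliest candidate in the list wins.
    static LogCandidate chooseLogToRecover(List<LogCandidate> candidates) {
        LogCandidate best = null;
        for (LogCandidate c : candidates) {
            if (best == null || c.validTxns > best.validTxns) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical paths: the same in-progress log in two storage dirs,
        // one of which was cut short.
        List<LogCandidate> logs = Arrays.asList(
                new LogCandidate("/data/1/current/edits_2_inprogress", 12),
                new LogCandidate("/data/2/current/edits_2_inprogress", 29));
        System.out.println(chooseLogToRecover(logs).path);
    }
}
```

Comparing transaction counts rather than byte lengths means a file padded with 
preallocated or garbage bytes does not look more complete than it is.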

I'd like to move the Op.Reader, Op.LogHeader, and the validation code into a 
new class, but rather than combine that refactoring with this patch, I think 
it should be a separate JIRA.

> 1073: determine edit log validity by truly reading and validating transactions
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-2074
>                 URL: https://issues.apache.org/jira/browse/HDFS-2074
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>    Affects Versions: Edit log branch (HDFS-1073)
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: Edit log branch (HDFS-1073)
>
>         Attachments: hdfs-2074.txt
>
>
> HDFS-2003 separated the deserialization/reading of log records from the 
> application of those records to a namesystem. This means that we can now read 
> through an edit log in order to determine how many valid transactions are 
> actually stored within. This is an improvement on what the 1073 branch 
> currently does, which is to simply look at how many bytes come before the 
> 0xFFFF... trailer at the end of the file.
> The next step after this is to use these new functions so that, when the NN 
> starts up and finds "in-progress" files like "edits_2_inprogress", it will 
> rename them to their finalized name like "edits_2-30" based on how many 
> transactions are truly stored within. This will simplify logic elsewhere.
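
To make the idea in the description concrete, here is a minimal sketch of 
counting valid transactions by actually reading them, then deriving a 
finalized name from the count. It rests on several assumptions: the record 
layout (txid, payload length, payload, CRC32) is invented for illustration and 
is not the real HDFS edit log format, and {{record}}, {{countValidTxns}}, and 
{{finalizedName}} are hypothetical helpers, not functions from the patch.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

class EditLogValidationSketch {

    // Hypothetical record layout, NOT the real HDFS edit log format:
    // [long txid][int payloadLength][payload][long CRC32 of the preceding bytes]
    static byte[] record(long txid, byte[] payload) {
        byte[] body = ByteBuffer.allocate(12 + payload.length)
                .putLong(txid).putInt(payload.length).put(payload).array();
        return ByteBuffer.allocate(body.length + 8)
                .put(body).putLong(crcOf(body)).array();
    }

    static long crcOf(byte[] bytes) {
        CRC32 crc = new CRC32();
        crc.update(bytes, 0, bytes.length);
        return crc.getValue();
    }

    // Counts the records that form a valid prefix of the log: consecutive
    // txids starting at firstTxId, each with a matching checksum.
    static int countValidTxns(byte[] log, long firstTxId) {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(log));
        int valid = 0;
        try {
            while (true) {
                long txid = in.readLong();
                if (txid != firstTxId + valid) break;       // gap, garbage, or 0xFF padding
                int len = in.readInt();
                if (len < 0 || len > in.available()) break; // implausible length
                byte[] payload = new byte[len];
                in.readFully(payload);
                long storedCrc = in.readLong();
                byte[] body = ByteBuffer.allocate(12 + len)
                        .putLong(txid).putInt(len).put(payload).array();
                if (storedCrc != crcOf(body)) break;        // corrupt record
                valid++;
            }
        } catch (IOException e) {
            // EOFException lands here: a record cut off mid-write ends the valid prefix.
        }
        return valid;
    }

    // Builds a name like "edits_2-30" from the first txid and the valid count.
    static String finalizedName(long firstTxId, int validTxns) {
        if (validTxns < 1) {
            throw new IllegalArgumentException("no valid transactions to finalize");
        }
        return "edits_" + firstTxId + "-" + (firstTxId + validTxns - 1);
    }

    public static void main(String[] args) {
        byte[] trailer = new byte[16];
        Arrays.fill(trailer, (byte) 0xFF);
        byte[] r2 = record(2, new byte[] {1});
        byte[] r3 = record(3, new byte[] {2, 3});
        byte[] log = ByteBuffer.allocate(r2.length + r3.length + trailer.length)
                .put(r2).put(r3).put(trailer).array();
        int n = countValidTxns(log, 2);
        System.out.println(n + " valid txns -> " + finalizedName(2, n));
    }
}
```

Unlike a scan for the trailer, this stops at the first record that fails 
validation, so bytes that merely sit before the 0xFF padding are not counted 
as transactions.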

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
