[ 
https://issues.apache.org/jira/browse/HDFS-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970371#action_12970371
 ] 

Todd Lipcon commented on HDFS-1521:
-----------------------------------

The reasons I chose to use the txid in FSEditLog rather than co-opt the 
generation stamp were:
- FSEditLog already has a txid, so it seemed strange to increment both it and 
the generation stamp.
- Incrementing FSN's generation stamp would add another cyclic dependency 
from FSEditLog back to the core of the NN, which we're trying to eliminate 
in other JIRAs.

As for deciding to put the txid in the header rather than on every record, I 
could go either way. I went with just the header because doing it in every 
record adds 8 bytes per edit, which would probably be 10% or so extra space 
overhead (likely causing 10% extra time spent loading the image too). I didn't 
benchmark it, but without any particular benefit, it didn't seem like it was 
worth the penalty.
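
To make the space estimate above concrete, here is a rough sketch of the arithmetic. The 80-byte average record size is an assumption back-derived from the "10% or so" figure, not a measurement, and the class and method names are hypothetical rather than anything in HDFS.

```java
// Back-of-the-envelope sketch, not HDFS code.
final class TxidOverheadSketch {
    // A txid stored as a Java long occupies 8 bytes.
    static final int TXID_BYTES = Long.BYTES;

    // Fractional space overhead if every edit record carries its own txid.
    static double perRecordOverhead(double avgRecordBytes) {
        return TXID_BYTES / avgRecordBytes;
    }

    // Fractional space overhead if the txid is written once in the file header.
    static double headerOnlyOverhead(double avgRecordBytes, long recordCount) {
        return TXID_BYTES / (avgRecordBytes * recordCount);
    }

    public static void main(String[] args) {
        double avgRecordBytes = 80.0; // ASSUMPTION: implied by the ~10% estimate
        long recordCount = 1_000_000; // ASSUMPTION: arbitrary log size
        System.out.println("per-record fraction:  " + perRecordOverhead(avgRecordBytes));
        System.out.println("header-only fraction: "
                + headerOnlyOverhead(avgRecordBytes, recordCount));
    }
}
```

With smaller average records the per-record cost rises proportionally, while the header-only cost is negligible for any log of realistic size.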

One compromise might be to periodically add a "sync" record which includes the 
current transaction ID and perhaps some kind of magic number, kind of like what 
SequenceFile does. This would be handy for repair processes or even for running 
MR jobs on edit logs some day. Thoughts?
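
A minimal sketch of what such a periodic sync record could look like follows. This is not the attached patch or the actual edit log format: the magic value, the interval, the record layout, and all names are hypothetical, and the edits are placeholder records.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Illustrative only: a toy log with a txid header and periodic sync records.
final class SyncRecordSketch {
    // ASSUMPTION: arbitrary magic number (ASCII "TXIDSYNC") and interval.
    static final long SYNC_MAGIC = 0x5458494453594E43L;
    static final int SYNC_INTERVAL = 100;

    // Writes `count` placeholder edits starting at firstTxid. The header holds
    // the first txid; every SYNC_INTERVAL edits a sync record (magic + txid)
    // is appended.
    static byte[] writeLog(long firstTxid, int count) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(firstTxid);
            for (int i = 0; i < count; i++) {
                long txid = firstTxid + i;
                out.writeByte(1);              // placeholder opcode
                out.writeUTF("/path/" + txid); // placeholder payload
                if ((i + 1) % SYNC_INTERVAL == 0) {
                    out.writeLong(SYNC_MAGIC);
                    out.writeLong(txid);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Scans byte by byte from an arbitrary offset for the next sync record and
    // returns the txid stored in it, or -1 if none is found.
    static long findNextSyncTxid(byte[] log, int offset) {
        ByteBuffer buf = ByteBuffer.wrap(log); // big-endian, like DataOutputStream
        for (int pos = offset; pos + 16 <= log.length; pos++) {
            if (buf.getLong(pos) == SYNC_MAGIC) {
                return buf.getLong(pos + 8);
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] log = writeLog(1000L, 250);
        // A repair tool or an MR input split could start mid-file and resync.
        System.out.println(findNextSyncTxid(log, log.length / 2));
    }
}
```

A fixed magic number is a simplification: a real format would have to ensure that record payloads cannot be mistaken for the marker, for example by using a per-file random marker or a checksum.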

> Persist transaction ID on disk between NN restarts
> --------------------------------------------------
>
>                 Key: HDFS-1521
>                 URL: https://issues.apache.org/jira/browse/HDFS-1521
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.22.0
>
>         Attachments: hdfs-1521.txt, hdfs-1521.txt
>
>
> For HDFS-1073 and other future work, we'd like to have the concept of a 
> transaction ID that is persisted on disk with the image/edits. We already 
> have this concept in the NameNode but it resets to 0 on restart. We can also 
> use this txid to replace the _checkpointTime_ field, I believe.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
