[jira] [Updated] (HDFS-1521) Persist transaction ID on disk between NN restarts
[ https://issues.apache.org/jira/browse/HDFS-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-1521:
------------------------------
    Fix Version/s:     (was: 0.23.0)
                   Edit log branch (HDFS-1073)

> Persist transaction ID on disk between NN restarts
> --------------------------------------------------
>
>                 Key: HDFS-1521
>                 URL: https://issues.apache.org/jira/browse/HDFS-1521
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: Edit log branch (HDFS-1073)
>
>         Attachments: FSImageFormat.patch, HDFS-1521.diff, HDFS-1521.diff, HDFS-1521.diff, HDFS-1521.diff, HDFS-1521.diff, hdfs-1521.3.txt, hdfs-1521.4.txt, hdfs-1521.5.txt, hdfs-1521.txt, hdfs-1521.txt, hdfs-1521.txt
>
>
> For HDFS-1073 and other future work, we'd like to have the concept of a
> transaction ID that is persisted on disk with the image/edits. We already
> have this concept in the NameNode but it resets to 0 on restart. We can also
> use this txid to replace the _checkpointTime_ field, I believe.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1521) Persist transaction ID on disk between NN restarts
Todd Lipcon updated HDFS-1521:
------------------------------
      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

Committed this to trunk. Thanks Ivan for the finishing touches and Konstantin for the several reviews.
[jira] [Updated] (HDFS-1521) Persist transaction ID on disk between NN restarts
Todd Lipcon updated HDFS-1521:
------------------------------
    Fix Version/s: 0.23.0
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Ivan Kelly updated HDFS-1521:
-----------------------------
    Attachment: HDFS-1521.diff
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Ivan Kelly updated HDFS-1521:
-----------------------------
    Status: Patch Available  (was: Open)
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Ivan Kelly updated HDFS-1521:
-----------------------------
    Status: Open  (was: Patch Available)
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Ivan Kelly updated HDFS-1521:
-----------------------------
    Status: Patch Available  (was: Open)
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Ivan Kelly updated HDFS-1521:
-----------------------------
    Status: Open  (was: Patch Available)
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Ivan Kelly updated HDFS-1521:
-----------------------------
    Attachment: HDFS-1521.diff

Merged shv's changes with the main patch. One minor change: storage is now passed to FSImageFormat directly, so getImageVersion no longer has to reach back through namesystem.getImage().getStorage to get the version.
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Ivan Kelly updated HDFS-1521:
-----------------------------
    Status: Patch Available  (was: Open)
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Ivan Kelly updated HDFS-1521:
-----------------------------
    Status: Open  (was: Patch Available)
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Konstantin Shvachko updated HDFS-1521:
--------------------------------------
    Attachment: FSImageFormat.patch

Attaching a diff for FSImageFormat only. Let's make sure imgVersion in the image file and layoutVersion in the VERSION file are the same, and then use the latter. This lets us avoid passing imgVersion to several methods, and clarifies that imgVersion in the image is nothing but a redundant field.

The rest looks good. +1
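The check suggested above can be sketched as follows. This is only an illustration, not the code in FSImageFormat.patch: the class and method names are hypothetical, the header is modelled as a ByteBuffer whose first four bytes hold imgVersion, and an unchecked exception stands in for whatever the real loader throws.

```java
import java.nio.ByteBuffer;

/** Illustrative sketch only; names and layout are assumptions, not HDFS code. */
class ImageVersionCheck {

  /**
   * Reads the version stored at the head of the image and verifies that it
   * matches the layoutVersion taken from the VERSION file. On success the
   * caller can rely on layoutVersion alone from here on.
   */
  static int checkImageVersion(ByteBuffer imageHeader, int layoutVersion) {
    // duplicate() leaves the caller's buffer position untouched
    int imgVersion = imageHeader.duplicate().getInt();
    if (imgVersion != layoutVersion) {
      throw new IllegalStateException(
          "imgVersion " + imgVersion + " does not match layoutVersion " + layoutVersion);
    }
    return layoutVersion;
  }
}
```

With a check like this done once at load time, later methods would not need an imgVersion parameter, which is the simplification the comment is after.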
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Todd Lipcon updated HDFS-1521:
------------------------------
    Attachment: hdfs-1521.txt

I took a swing through Ivan's latest patch and it looks good to me. I fixed up one javadoc error and added a new comment in one bit of code that needed some clarification. Aside from that, this patch is identical to Ivan's.

+1 from me - Konstantin, would you mind taking one last look before we commit this?
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Ivan Kelly updated HDFS-1521:
-----------------------------
    Attachment: HDFS-1521.diff

Solved the control op problem by simply not beginning a transaction if opcode > JSPOOL_START opcode. I think this is ready for submission now, unless there are more comments.
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Ivan Kelly updated HDFS-1521:
-----------------------------
    Status: Patch Available  (was: Open)
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Ivan Kelly updated HDFS-1521:
-----------------------------
    Status: Open  (was: Patch Available)
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Ivan Kelly updated HDFS-1521:
-----------------------------
    Attachment: HDFS-1521.diff

This patch addresses all of Konstantin's comments except 11. There is something strange going on with lastAppliedTxId that TestBackupNode isn't currently picking up. At line 177, change it to the following:
{code}
//
// Take a checkpoint
//
backup = startBackupNode(conf, op, 1);
waitCheckpointDone(backup);

for (int i = 0; i < 10; i++) {
  writeFile(fileSys, new Path("file_" + i), replication);
}

backup.doCheckpoint();
waitCheckpointDone(backup);
{code}
This will cause the test to fail. The normal run of the test doesn't exercise convergeJournalSpool, so usually you don't see this.

So, now you'll see that if BackupNode loads a checkpoint and then tries to journal something, lastAppliedTxId + 1 will be 1 even though we've loaded in an image and editlog. The simple fix is to put
{code}
lastAppliedTxId = getEditLog().getLastWrittenTxId();
{code}
in loadCheckpoint(). This should be the end of the story. However, with this change, you get the error
{quote}
java.io.IOException: Expected transaction ID 10 but got 11
{quote}
A transaction is going missing. What's happening is that when doCheckpoint gets kicked off, the log is rolled and logJSpoolStart is called, which creates an edit with opcode OP_JSPOOL_START. This opcode is caught by EditLogBackupOutputStream and never transmitted to the backup node, so the transaction IDs on the Primary and the Backup get out of sync.

So, the question here is: is there any harm in actually transferring these OP_JSPOOL_START transactions, or are they just excluded as a precaution?
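The mismatch described in this comment can be reproduced in miniature. The sketch below is a toy model under stated assumptions, not HDFS code: the class, the Edit type, and all three methods are hypothetical, the opcode values are placeholders, and an unchecked exception replaces the java.io.IOException seen in the real test.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the primary/backup txid mismatch; all names are hypothetical. */
class JournalDesyncSketch {
  static final int OP_ADD = 0;            // stand-in for any namespace edit
  static final int OP_JSPOOL_START = 102; // placeholder value for the control op

  static final class Edit {
    final int opcode;
    final long txid;

    Edit(int opcode, long txid) {
      this.opcode = opcode;
      this.txid = txid;
    }
  }

  /** Primary side: every op, control ops included, consumes a txid. */
  static List<Edit> primaryLog(int... opcodes) {
    List<Edit> log = new ArrayList<>();
    long txid = 0;
    for (int op : opcodes) {
      log.add(new Edit(op, ++txid));
    }
    return log;
  }

  /** Stream to the backup: silently drops the control op, as described above. */
  static List<Edit> transmit(List<Edit> log) {
    List<Edit> sent = new ArrayList<>();
    for (Edit e : log) {
      if (e.opcode != OP_JSPOOL_START) {
        sent.add(e);
      }
    }
    return sent;
  }

  /** Backup side: applies edits in order and returns the new lastAppliedTxId. */
  static long apply(List<Edit> received, long lastAppliedTxId) {
    for (Edit e : received) {
      long expected = lastAppliedTxId + 1;
      if (e.txid != expected) {
        throw new IllegalStateException(
            "Expected transaction ID " + expected + " but got " + e.txid);
      }
      lastAppliedTxId = e.txid;
    }
    return lastAppliedTxId;
  }
}
```

If the tenth op on the primary is the control op and it is filtered out in transit, the backup applies txids 1 through 9 and then receives 11 where it expects 10. Either transmitting the control op or not assigning it a txid at all would keep the two sides aligned.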
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Ivan Kelly updated HDFS-1521:
-----------------------------
    Attachment: HDFS-1521.diff

Updated patch to apply on trunk.
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Todd Lipcon updated HDFS-1521:
------------------------------
    Attachment: hdfs-1521.5.txt

Thanks for the thorough review. Here's a new patch.

bq. FSImage.loadFSEdits(StorageDirectory sd) should return boolean instead of FSEditLogLoader

Fixed

bq. You can avoid introducing FSEditLogLoader.needResave by setting expectedStartingTxId before checking that logVersion != FSConstants.LAYOUT_VERSION. Then the old logic of counting this event as an extra transaction will work

I found the former logic here to be very confusing and somewhat of a hack. It's also important that the loader returns the correct number of edits rather than potentially returning 1 when there are 0 edits. If it did that, it would break many cases by potentially causing a skip in transaction IDs. Though the new code adds a new member, the new member has a clear purpose and I think it's easier to understand from the caller's perspective, especially now that your point #1 above is addressed.

bq. It would be good if you could replace FSEditLogLoader.expectedStartingTxId member by the respective parameter to loadFSEdits
bq. I think after that you can also get rid of FSEditLogLoader.numEditsLoaded.

Fixed

bq. Why don't we write first opCode, then txID, then Writable. There will be less code changes on the loading part

Very good call! This indeed cleaned up the loading code a lot.

bq. Should we introduce TransactionHeader at this point and write it as Writable. Just something to consider

I think given that the header is still pretty simple it's not worth it at this point.

bq. Need to change JavaDoc for EditLogOutputStream.write(). Missing parameter

Fixed

bq. I don't see any reason to have txID in the beginning of every edits file. You will have it the name, right
bq. beginTransaction() instead of startTransaction, as it matches with endTransaction()

Fixed.

bq. Don't change rollEditLog() to return long. It is only used in the test

It's necessary that the transaction ID be returned inside the same synchronization block. If we used a separate call to getLastWrittenTxId() then another txid could have been written in between (note that the test is multithreaded).

bq. It looks to me that FSImage.checkpointTxId is simply currentTxId. If it is, it would be more intuitive

It's not really current - it's the txid of the image file, not including any edits that have been written to the edit log - sort of like how checkpointTime is set only when an image is saved. Naming it "currentTxId" would imply that it is updated on every edit.

bq. BackupStorage.lastAppliedTxId isn't it just checkpointTxId, which is already defined in the base FSImage.

Contrary to above, lastAppliedTxId refers to the transaction ID that has been applied to the namespace. This is always >= checkpointTxId - checkpointTxId only changes when the BN saves an image, but lastAppliedTxId changes every time some edits are applied via RPC.

I'll run the new patch through the unit test suite one more time.
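The point about returning the txid from rollEditLog() inside the same synchronization block can be illustrated with a small model. This is a sketch under assumptions, not the FSEditLog implementation: ToyEditLog and its fields are hypothetical, and only the method names rollEditLog and getLastWrittenTxId are borrowed from the discussion.

```java
/** Toy edit log; a hypothetical stand-in, not the HDFS FSEditLog class. */
class ToyEditLog {
  private long lastWrittenTxId = 0;
  private int segmentsRolled = 0;

  /** Logs one edit and returns the txid assigned to it. */
  synchronized long logEdit() {
    return ++lastWrittenTxId;
  }

  synchronized long getLastWrittenTxId() {
    return lastWrittenTxId;
  }

  /**
   * Rolls the log and returns the last txid of the segment just closed.
   * Because the value is read while the lock is still held, no other
   * writer can slip an edit in between the roll and the read.
   */
  synchronized long rollEditLog() {
    segmentsRolled++;
    return lastWrittenTxId;
  }

  synchronized int getSegmentsRolled() {
    return segmentsRolled;
  }
}
```

A caller that instead did `log.rollEditLog(); long txid = log.getLastWrittenTxId();` would take the lock twice, and in a multithreaded test another thread could call logEdit() between the two calls, so the second read might already be past the roll point.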
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Konstantin Shvachko updated HDFS-1521:
--------------------------------------
    Component/s: name-node
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Todd Lipcon updated HDFS-1521:
------------------------------
    Attachment: hdfs-1521.4.txt

New version addresses the TODOs, removes a couple of places where I had put unrelated notes to myself, and adds an explicit initialization of BackupStorage#lastAppliedTxId (per review comment by Sanjay offline).

Will rerun unit tests just to be safe.
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Todd Lipcon updated HDFS-1521:
------------------------------
    Attachment: hdfs-1521.3.txt

This patch switches to logging a txid for every edit and verifying strict sequential ordering on load. I also left the txid in the header - it seemed to me this is advantageous just as something that *must* be there at the top of every edit file. If others disagree we can take it out.

Added some basic tests as well to ensure we can still read the old format.
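As a rough illustration of "a txid for every edit" with a strict ordering check on load, consider the sketch below. It is not the on-disk format or the loader from the patch: the record layout (a one-byte opcode followed by an eight-byte txid, with no payload), the class, and both method names are assumptions, and an unchecked exception is used in place of an IOException.

```java
import java.nio.ByteBuffer;

/** Illustrative sketch only; the record layout here is an assumption. */
class SequentialTxIdCheck {

  /** Writes one toy record per opcode, numbering them from firstTxId upward. */
  static ByteBuffer writeEdits(long firstTxId, int... opcodes) {
    ByteBuffer buf = ByteBuffer.allocate(opcodes.length * (1 + Long.BYTES));
    long txid = firstTxId;
    for (int op : opcodes) {
      buf.put((byte) op);
      buf.putLong(txid++);
    }
    buf.flip(); // switch the buffer from writing to reading
    return buf;
  }

  /**
   * Reads every record and fails on the first txid that is not exactly one
   * greater than its predecessor. Returns the last txid read, or
   * expectedStartingTxId - 1 if the buffer holds no edits.
   */
  static long loadEdits(ByteBuffer edits, long expectedStartingTxId) {
    ByteBuffer in = edits.duplicate(); // independent read position
    long expected = expectedStartingTxId;
    while (in.hasRemaining()) {
      in.get(); // opcode; a real loader would dispatch on it
      long txid = in.getLong();
      if (txid != expected) {
        throw new IllegalStateException(
            "Expected transaction ID " + expected + " but got " + txid);
      }
      expected++;
    }
    return expected - 1;
  }
}
```

Returning the last txid rather than a count of edits means an empty file leaves the caller's position unchanged, which avoids the kind of accidental skip in transaction IDs that a strict check is meant to catch.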
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Todd Lipcon updated HDFS-1521:
------------------------------
    Attachment: hdfs-1521.txt

Patch fell out of date. Rebased patch.
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Todd Lipcon updated HDFS-1521:
------------------------------
    Fix Version/s: 0.22.0
           Status: Patch Available  (was: Open)
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Todd Lipcon updated HDFS-1521:
------------------------------
    Attachment: hdfs-1521.txt

This patch adds the transaction ID to the header of the edit log and image, and restores it when the NN loads a new one.
[jira] Updated: (HDFS-1521) Persist transaction ID on disk between NN restarts
Todd Lipcon updated HDFS-1521:
------------------------------
          Description:
For HDFS-1073 and other future work, we'd like to have the concept of a transaction ID that is persisted on disk with the image/edits. We already have this concept in the NameNode but it resets to 0 on restart. We can also use this txid to replace the _checkpointTime_ field, I believe.

    Affects Version/s: 0.22.0