[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13120123#comment-13120123 ] Hudson commented on HDFS-1073: -- Integrated in Hadoop-Mapreduce-trunk #850 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/850/]) Fix CHANGES.txt to include complete subtask list for HDFS-1073. Somehow in the merge, some subtasks were lost from CHANGES.txt. I spot-checked these patches to make sure they were in fact merged, and it was only CHANGES.txt that was missing them. todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1178610 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > Simpler model for Namenode's fs Image and edit Logs > > > Key: HDFS-1073 > URL: https://issues.apache.org/jira/browse/HDFS-1073 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Sanjay Radia >Assignee: Todd Lipcon > Fix For: 0.23.0 > > Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073-merge.patch, > hdfs-1073-merge.patch, hdfs-1073-merge.patch, hdfs-1073.txt, hdfs1073.pdf, > hdfs1073.pdf, hdfs1073.pdf, hdfs1073.tex > > > The naming and handling of the NN's fsImage and edit logs can be significantly > improved, resulting in simpler and more robust code. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13120109#comment-13120109 ] Hudson commented on HDFS-1073: -- Integrated in Hadoop-Hdfs-0.23-Build #29 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/29/]) Fix CHANGES.txt to include complete subtask list for HDFS-1073. Somehow in the merge, some subtasks were lost from CHANGES.txt. I spot-checked these patches to make sure they were in fact merged, and it was only CHANGES.txt that was missing them. todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1178611 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13120089#comment-13120089 ] Hudson commented on HDFS-1073: -- Integrated in Hadoop-Hdfs-trunk #820 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/820/]) Fix CHANGES.txt to include complete subtask list for HDFS-1073. Somehow in the merge, some subtasks were lost from CHANGES.txt. I spot-checked these patches to make sure they were in fact merged, and it was only CHANGES.txt that was missing them. todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1178610 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13120066#comment-13120066 ] Hudson commented on HDFS-1073: -- Integrated in Hadoop-Mapreduce-0.23-Build #36 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/36/]) Fix CHANGES.txt to include complete subtask list for HDFS-1073. Somehow in the merge, some subtasks were lost from CHANGES.txt. I spot-checked these patches to make sure they were in fact merged, and it was only CHANGES.txt that was missing them. todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1178611 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13119684#comment-13119684 ] Hudson commented on HDFS-1073: -- Integrated in Hadoop-Hdfs-trunk-Commit #1079 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1079/]) Fix CHANGES.txt to include complete subtask list for HDFS-1073. Somehow in the merge, some subtasks were lost from CHANGES.txt. I spot-checked these patches to make sure they were in fact merged, and it was only CHANGES.txt that was missing them. todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1178610 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13119683#comment-13119683 ] Hudson commented on HDFS-1073: -- Integrated in Hadoop-Common-trunk-Commit #1001 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1001/]) Fix CHANGES.txt to include complete subtask list for HDFS-1073. Somehow in the merge, some subtasks were lost from CHANGES.txt. I spot-checked these patches to make sure they were in fact merged, and it was only CHANGES.txt that was missing them. todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1178610 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13119697#comment-13119697 ] Hudson commented on HDFS-1073: -- Integrated in Hadoop-Mapreduce-trunk-Commit #1021 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1021/]) Fix CHANGES.txt to include complete subtask list for HDFS-1073. Somehow in the merge, some subtasks were lost from CHANGES.txt. I spot-checked these patches to make sure they were in fact merged, and it was only CHANGES.txt that was missing them. todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1178610 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082051#comment-13082051 ] Hudson commented on HDFS-1073: -- Integrated in Hadoop-Hdfs-1073-branch #23 (See [https://builds.apache.org/job/Hadoop-Hdfs-1073-branch/23/]) Merge trunk into HDFS-1073. Resolved several conflicts due to merge of HDFS-2149 and HDFS-2212. Changes during resolution were: - move the writing of the transaction ID out of EditLogOutputStream to FSEditLogOp.Writer to match trunk's organization - remove JSPOOL-related FSEditLogOp subclasses, add LogSegmentOp subclasses - modify TestEditLogJournalFailures to not keep trying to use streams after the simulated halt, since newer, stricter assertions caused these writes to fail todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1152128 Files : * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestFileAppend2.java * /hadoop/common/branches/HDFS-1073/hdfs/CHANGES.txt * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOp.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/DecommissionManager.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/blockmanagement/TestHeartbeatHandling.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/blockmanagement/BlockManagerTestUtil.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestFileStatus.java * /hadoop/common/branches/HDFS-1073/hdfs/src/docs/src/documentation/content/xdocs/hdfsproxy.xml * 
/hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestComputeInvalidateWork.java * /hadoop/common/branches/HDFS-1073/hdfs/src/contrib/build.xml * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestSaveNamespace.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestDatanodeDeath.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DirectoryScanner.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestFileCreationDelete.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestDFSUpgradeFromImage.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestClusterId.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestFileAppend3.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/common/JspHelper.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/DFSClient.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestEditLogJournalFailures.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestFileConcurrentReader.java * /hadoop/common/branches/HDFS-1073/hdfs/src/docs/src/documentation/content/xdocs/site.xml * /hadoop/common/branches/HDFS-1073/hdfs/src/test/unit/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java * 
/hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/balancer/TestBalancerWithMultipleNameNodes.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestDecommission.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestEditsDoubleBuffer.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestLeaseRecovery2.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/common/Storage.java * /hadoop/common/branches/HDFS-1073/hdfs/src/webapps/datanode * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestMultiThreadedHflush.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestFileAppend4.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdf
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079853#comment-13079853 ] Hudson commented on HDFS-1073: -- Integrated in Hadoop-Hdfs-trunk-Commit #812 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/812/]) HDFS-1073. Redesign the NameNode's storage layout for image checkpoints and edit logs to introduce transaction IDs and be more robust. Contributed by Todd Lipcon and Ivan Kelly. todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1152295 Files : * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestFileAppend4.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/FSImageTestUtil.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestStartupOptionUpgrade.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/EditsLoaderCurrent.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/protocol/NamespaceInfo.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestFSImage.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/CreateEditsLog.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestFsLimits.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestEditLogRace.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOp.java * /hadoop/common/trunk/hdfs/ivy/libraries.properties * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/tools/offlineEditsViewer/editsStored.xml * 
/hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOpCodes.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/Checkpointer.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestFSEditLogLoader.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/MiniDFSCluster.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditLogFileInputStream.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/util/TestMD5FileUtils.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestStartup.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestDFSUpgrade.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/GetImageServlet.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestDFSStorageStateRecovery.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestCheckpoint.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestEditLogJournalFailures.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImagePreTransactionalStorageInspector.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/common/StorageAdapter.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestFSImageStorageInspector.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestNNStorageRetentionManager.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java * 
/hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/EditsElement.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NNStorageRetentionManager.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImageStorageInspector.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestTransferFsImage.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/common/StorageInfo.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImageTransactionalStorageInspector.java * /hadoop/common/trunk/hdfs/ivy.xml * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/test/GenericTestUtils.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/TransferFsImage.java * /hadoop/common/trunk/hdfs/src/test/findbugsExcludeFile.xml * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java *
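The commit message above describes a storage layout built around transaction IDs. As a rough illustration of why that helps, here is a minimal Java sketch of naming checkpoint and edit-log files by transaction ID. The class name `TxidFileNames`, its method names, and the exact file-name patterns are assumptions made for this example; they are not quoted from the patch or from the messages in this thread.

```java
import java.util.Locale;

// Hypothetical helper, not taken from the HDFS-1073 patch: it only sketches
// how file names can carry the transaction IDs they cover.
final class TxidFileNames {
    private TxidFileNames() {}

    // Name for an image checkpoint that reflects all transactions up to txId.
    static String imageFileName(long txId) {
        requireNonNegative(txId);
        // Zero-padding to a fixed width keeps lexicographic and numeric order aligned.
        return String.format(Locale.ROOT, "fsimage_%019d", txId);
    }

    // Name for a closed edit-log segment covering firstTxId..lastTxId inclusive.
    static String finalizedEditsFileName(long firstTxId, long lastTxId) {
        requireNonNegative(firstTxId);
        if (lastTxId < firstTxId) {
            throw new IllegalArgumentException("lastTxId must be >= firstTxId");
        }
        return String.format(Locale.ROOT, "edits_%019d-%019d", firstTxId, lastTxId);
    }

    // Name for a segment that is still being written, starting at firstTxId.
    static String inProgressEditsFileName(long firstTxId) {
        requireNonNegative(firstTxId);
        return String.format(Locale.ROOT, "edits_inprogress_%019d", firstTxId);
    }

    private static void requireNonNegative(long txId) {
        if (txId < 0) {
            throw new IllegalArgumentException("transaction ID must be non-negative");
        }
    }

    public static void main(String[] args) {
        System.out.println(imageFileName(123L));
        System.out.println(finalizedEditsFileName(124L, 200L));
        System.out.println(inProgressEditsFileName(201L));
    }
}
```

With names like these, a reader of the storage directory could in principle tell which edit segments follow a given checkpoint just by comparing the embedded IDs, rather than relying on fixed names and file timestamps.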
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079834#comment-13079834 ] Hudson commented on HDFS-1073: -- Integrated in Hadoop-Hdfs-trunk #738 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/738/]) HDFS-1073. Redesign the NameNode's storage layout for image checkpoints and edit logs to introduce transaction IDs and be more robust. Contributed by Todd Lipcon and Ivan Kelly. todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1152295 Files : * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestFileAppend4.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/FSImageTestUtil.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestStartupOptionUpgrade.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/EditsLoaderCurrent.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/protocol/NamespaceInfo.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestFSImage.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/CreateEditsLog.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestFsLimits.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestEditLogRace.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOp.java * /hadoop/common/trunk/hdfs/ivy/libraries.properties * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/tools/offlineEditsViewer/editsStored.xml * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOpCodes.java * 
/hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/Checkpointer.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestFSEditLogLoader.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/MiniDFSCluster.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditLogFileInputStream.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/util/TestMD5FileUtils.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestStartup.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestDFSUpgrade.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/GetImageServlet.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestDFSStorageStateRecovery.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestCheckpoint.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestEditLogJournalFailures.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImagePreTransactionalStorageInspector.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/common/StorageAdapter.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestFSImageStorageInspector.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestNNStorageRetentionManager.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/EditsElement.java * 
/hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NNStorageRetentionManager.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImageStorageInspector.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestTransferFsImage.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/common/StorageInfo.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImageTransactionalStorageInspector.java * /hadoop/common/trunk/hdfs/ivy.xml * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/test/GenericTestUtils.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/TransferFsImage.java * /hadoop/common/trunk/hdfs/src/test/findbugsExcludeFile.xml * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * /hadoop/commo
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072995#comment-13072995 ] Konstantin Shvachko commented on HDFS-1073: --- I reviewed this some more and didn't find any outstanding issues. I am also +1. Good job, Todd! - Let's just fix those test failures. - For Java warnings, could you please add @SuppressWarnings for the deprecations you mentioned? We generally should target zero warnings in the code. One general comment. The patch started as a fairly straightforward change in the structure of journal files, but ended up changing many different parts, essentially rewriting some major components of HDFS. Some people mentioned that in a way it is an abuse of the idea of using dev branches for large changes. In the extreme it would look like somebody is making a change and piggybacking everything he ever wanted to do with the system. This is not to criticize the work done, but something to keep in mind for the future: there should be a fine balance between what is done in the trunk and what goes into the branch. E.g. refactoring is on the trunk, main logic on the branch.
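The request above to suppress the deprecation warnings could look roughly like the following sketch. `LegacyCheckpointer` is a hypothetical stand-in for a deprecated class such as SecondaryNameNode, and `LegacyCheckpointerCheck` is an invented caller; neither name comes from the patch. The only real API used is the standard `@Deprecated` and `@SuppressWarnings("deprecation")` annotations.

```java
// Hypothetical stand-in for a deprecated class; not part of Hadoop.
@Deprecated
class LegacyCheckpointer {
    // Pretend to checkpoint and report the last transaction ID covered.
    long checkpoint(long lastTxId) {
        return lastTxId;
    }
}

// Hypothetical caller, e.g. a test that still has to exercise the deprecated class.
class LegacyCheckpointerCheck {
    // Scope the suppression to the single method that touches the deprecated
    // type, so unrelated deprecation warnings elsewhere stay visible.
    @SuppressWarnings("deprecation")
    static long runLegacyCheckpoint(long lastTxId) {
        LegacyCheckpointer checkpointer = new LegacyCheckpointer();
        return checkpointer.checkpoint(lastTxId);
    }

    public static void main(String[] args) {
        System.out.println(runLegacyCheckpoint(42L));
    }
}
```

Keeping the annotation on the narrowest element that needs it is a design choice rather than a requirement: annotating a whole test class would also quiet javac, at the cost of hiding any new deprecated usages added to that class later.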
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072891#comment-13072891 ] Todd Lipcon commented on HDFS-1073: --- Unit tests passed except for a couple of things which also had issues on trunk (e.g. TestHDFSCLI, and a few timeouts due to HDFS-2213 which did not reproduce when I reran the tests in question). I will commit this to trunk momentarily.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072884#comment-13072884 ] Matt Foley commented on HDFS-1073: -- I still believe that HDFS-2136 is very important. Please keep it on the post-merge cleanup list. Thanks.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072720#comment-13072720 ] Todd Lipcon commented on HDFS-1073: --- test-patch: {noformat} [exec] -1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 148 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] -1 javac. The applied patch generated 32 javac compiler warnings (more than the trunk's current 23 warnings). [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.8) warnings. [exec] [exec] -1 release audit. The applied patch generated 1 release audit warnings (more than the trunk's current 0 warnings). [exec] [exec] +1 system test framework. The patch passed system test framework compile. {noformat} bq. -1 javac. The applied patch generated 32 javac compiler warnings (more than the trunk's current 23 warnings). This is just due to more references to SecondaryNameNode, which is officially deprecated. We now have much better test coverage, hence more deprecation warnings. bq. -1 release audit. This is for CHANGES.HDFS-1073.txt, which will be merged into CHANGES.txt when the svn trees are actually merged. Running unit tests now. > Simpler model for Namenode's fs Image and edit Logs > > > Key: HDFS-1073 > URL: https://issues.apache.org/jira/browse/HDFS-1073 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Sanjay Radia >Assignee: Todd Lipcon > Fix For: 0.23.0 > > Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073-merge.patch, > hdfs-1073-merge.patch, hdfs-1073-merge.patch, hdfs-1073.txt, hdfs1073.pdf, > hdfs1073.pdf, hdfs1073.pdf, hdfs1073.tex > > > The naming and handling of NN's fsImage and edit logs can be significantly > improved resulting simpler and more robust code. 
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072681#comment-13072681 ] Todd Lipcon commented on HDFS-1073: --- Great, we now have +1s from the following committers: Jitendra, Eli, and Matt, plus an additional +1 from Ivan who has reviewed much of the code and is knowledgeable. So, this should be good to merge. If there is further review feedback I'll continue to address it in follow-up JIRAs. There is a bit of a conflict on the merge currently because of a couple patches that went into trunk. I will fix this and post a final merge patch this evening. > Simpler model for Namenode's fs Image and edit Logs > > > Key: HDFS-1073 > URL: https://issues.apache.org/jira/browse/HDFS-1073 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Sanjay Radia >Assignee: Todd Lipcon > Fix For: 0.23.0 > > Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073-merge.patch, > hdfs-1073-merge.patch, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, > hdfs1073.pdf, hdfs1073.tex > > > The naming and handling of NN's fsImage and edit logs can be significantly > improved resulting simpler and more robust code. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072625#comment-13072625 ] Ivan Kelly commented on HDFS-1073: -- +1 I've got a whole load of patches waiting to go on top of this, so the sooner it goes in the better :) > Simpler model for Namenode's fs Image and edit Logs > > > Key: HDFS-1073 > URL: https://issues.apache.org/jira/browse/HDFS-1073 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Sanjay Radia >Assignee: Todd Lipcon > Fix For: 0.23.0 > > Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073-merge.patch, > hdfs-1073-merge.patch, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, > hdfs1073.pdf, hdfs1073.tex > > > The naming and handling of NN's fsImage and edit logs can be significantly > improved resulting simpler and more robust code. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072603#comment-13072603 ] Eli Collins commented on HDFS-1073: --- +1 Let's merge. > Simpler model for Namenode's fs Image and edit Logs > > > Key: HDFS-1073 > URL: https://issues.apache.org/jira/browse/HDFS-1073 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Sanjay Radia >Assignee: Todd Lipcon > Fix For: 0.23.0 > > Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073-merge.patch, > hdfs-1073-merge.patch, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, > hdfs1073.pdf, hdfs1073.tex > > > The naming and handling of NN's fsImage and edit logs can be significantly > improved resulting simpler and more robust code. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072498#comment-13072498 ] Matt Foley commented on HDFS-1073: -- +1. I have become confident that the merge should proceed. > Simpler model for Namenode's fs Image and edit Logs > > > Key: HDFS-1073 > URL: https://issues.apache.org/jira/browse/HDFS-1073 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Sanjay Radia >Assignee: Todd Lipcon > Fix For: 0.23.0 > > Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073-merge.patch, > hdfs-1073-merge.patch, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, > hdfs1073.pdf, hdfs1073.tex > > > The naming and handling of NN's fsImage and edit logs can be significantly > improved resulting simpler and more robust code. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072497#comment-13072497 ] Jitendra Nath Pandey commented on HDFS-1073: +1. I think the patch is in good shape and ready for merge. > Simpler model for Namenode's fs Image and edit Logs > > > Key: HDFS-1073 > URL: https://issues.apache.org/jira/browse/HDFS-1073 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Sanjay Radia >Assignee: Todd Lipcon > Fix For: 0.23.0 > > Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073-merge.patch, > hdfs-1073-merge.patch, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, > hdfs1073.pdf, hdfs1073.tex > > > The naming and handling of NN's fsImage and edit logs can be significantly > improved resulting simpler and more robust code. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072104#comment-13072104 ] Todd Lipcon commented on HDFS-1073: --- I've renamed BackupNodeProtocol to JournalProtocol, and renamed NNStorageArchivalManager to NNStorageRetentionManager in the branch. Any further comments? > Simpler model for Namenode's fs Image and edit Logs > > > Key: HDFS-1073 > URL: https://issues.apache.org/jira/browse/HDFS-1073 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Sanjay Radia >Assignee: Todd Lipcon > Fix For: 0.23.0 > > Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073-merge.patch, > hdfs-1073-merge.patch, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, > hdfs1073.pdf, hdfs1073.tex > > > The naming and handling of NN's fsImage and edit logs can be significantly > improved resulting simpler and more robust code. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071824#comment-13071824 ] Hudson commented on HDFS-1073: -- Integrated in Hadoop-Hdfs-1073-branch #22 (See [https://builds.apache.org/job/Hadoop-Hdfs-1073-branch/22/]) Rename StorageArchiver to StoragePurger as suggested by Matt and Ivan in the comments on HDFS-1073 todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1151192 Files : * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/BackupJournalManager.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/JournalManager.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NNStorageArchivalManager.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/GetImageServlet.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestNNStorageArchivalManager.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestNNStorageArchivalFunctional.java > Simpler model for Namenode's fs Image and edit Logs > > > Key: HDFS-1073 > URL: https://issues.apache.org/jira/browse/HDFS-1073 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Sanjay Radia >Assignee: Todd Lipcon > Fix For: 0.23.0 > > Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073-merge.patch, > 
hdfs-1073-merge.patch, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, > hdfs1073.pdf, hdfs1073.tex > > > The naming and handling of NN's fsImage and edit logs can be significantly > improved resulting simpler and more robust code. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071542#comment-13071542 ] Todd Lipcon commented on HDFS-1073: --- Great. I'll rename to NNStorageRetentionManager or something of that sort - probably tomorrow morning, along with the protocol renaming as suggested by Konstantin. > Simpler model for Namenode's fs Image and edit Logs > > > Key: HDFS-1073 > URL: https://issues.apache.org/jira/browse/HDFS-1073 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Sanjay Radia >Assignee: Todd Lipcon > Fix For: 0.23.0 > > Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073-merge.patch, > hdfs-1073-merge.patch, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, > hdfs1073.pdf, hdfs1073.tex > > > The naming and handling of NN's fsImage and edit logs can be significantly > improved resulting simpler and more robust code. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071540#comment-13071540 ] Matt Foley commented on HDFS-1073: -- I would phrase it that the "purge policy" determines how long to retain the files online, and what to do with them after you no longer want to keep them online - archive, delete, or whatever. So calling it Retention Policy and RetentionManager would be fine. > Simpler model for Namenode's fs Image and edit Logs > > > Key: HDFS-1073 > URL: https://issues.apache.org/jira/browse/HDFS-1073 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Sanjay Radia >Assignee: Todd Lipcon > Fix For: 0.23.0 > > Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073-merge.patch, > hdfs-1073-merge.patch, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, > hdfs1073.pdf, hdfs1073.tex > > > The naming and handling of NN's fsImage and edit logs can be significantly > improved resulting simpler and more robust code. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071530#comment-13071530 ] Todd Lipcon commented on HDFS-1073: --- My thinking was that the "archival policy" determines how things are kept vs archived vs deleted... maybe RetentionManager is better? > Simpler model for Namenode's fs Image and edit Logs > > > Key: HDFS-1073 > URL: https://issues.apache.org/jira/browse/HDFS-1073 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Sanjay Radia >Assignee: Todd Lipcon > Fix For: 0.23.0 > > Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073-merge.patch, > hdfs-1073-merge.patch, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, > hdfs1073.pdf, hdfs1073.tex > > > The naming and handling of NN's fsImage and edit logs can be significantly > improved resulting simpler and more robust code. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071523#comment-13071523 ] Matt Foley commented on HDFS-1073: -- Perhaps the rename of "Archival" to "Purge" should include the class names of NNStorageArchivalManager, TestNNStorageArchivalManager, and TestNNStorageArchivalFunctional. > Simpler model for Namenode's fs Image and edit Logs > > > Key: HDFS-1073 > URL: https://issues.apache.org/jira/browse/HDFS-1073 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Sanjay Radia >Assignee: Todd Lipcon > Fix For: 0.23.0 > > Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073-merge.patch, > hdfs-1073-merge.patch, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, > hdfs1073.pdf, hdfs1073.tex > > > The naming and handling of NN's fsImage and edit logs can be significantly > improved resulting simpler and more robust code. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071462#comment-13071462 ] Todd Lipcon commented on HDFS-1073: ---
I see your point now about the protocol naming. I'll change it to JournalProtocol.
bq. Document the architecture is important as it is the proof of correctness of the approach
If writing documents about code guaranteed the code were correct, our jobs would be a lot easier, wouldn't they? :) But yes, I'll clean up the doc after merging to make sure there's nothing inaccurate.
bq. I hope "longer" does not mean file length?
When the only logs available starting at a given transaction ID are named edits_inprogress_N, we read through them to determine the "valid length" -- i.e. the number of valid transactions. A transaction is valid if it has a valid checksum, sequential transaction ID, etc. The one with the most valid transactions is chosen. So, extra 0s or FFs on the end of a file won't affect the "valid length".
bq. Do you attempt to restore bad streams on rollEdits() as done by attemptRestoreRemovedStorage() in current implementation?
Yes -- each JournalManager creates a new OutputStream object when edits are rolled.
bq. ...OP_JSPOOL_START...
Yep, this is entirely eliminated now.
{quote}
Is it possible in your implementation that
a) BN already processed transactions with higher id than segmentTxId
b) BN hasn't seen yet transaction preceding segmentTxId
According to Precondition this should not be possible. What guarantees that?
{quote}
Because all of the calls to the BN go through JournalManager, and all of the calls are synchronous, the ordering won't get interleaved. That is to say, when an edit log is rolled, the startLogSegment() RPC call must respond before the next transaction can be journaled. And, before calling startLogSegment(), the previous log segment is flushed, guaranteeing that all previous edits "made it". 
The Precondition is there just in case there's a bug that we missed -- this way we'll get a BN crash rather than something worse like silent data loss. > Simpler model for Namenode's fs Image and edit Logs > > > Key: HDFS-1073 > URL: https://issues.apache.org/jira/browse/HDFS-1073 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Sanjay Radia >Assignee: Todd Lipcon > Fix For: 0.23.0 > > Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073-merge.patch, > hdfs-1073-merge.patch, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, > hdfs1073.pdf, hdfs1073.tex > > > The naming and handling of NN's fsImage and edit logs can be significantly > improved resulting simpler and more robust code. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
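The "valid length" rule described in this comment can be sketched as follows. This is a minimal illustration under stated assumptions: EditRecord and InProgressLogChooser are hypothetical names, a record is reduced to a transaction ID plus a checksum flag, and ties go to the first candidate. None of it is taken from the branch's code.

```java
import java.util.List;

/** Hypothetical stand-in for one decoded record of an edits_inprogress_N file. */
final class EditRecord {
    final long txId;
    final boolean checksumOk;

    EditRecord(long txId, boolean checksumOk) {
        this.txId = txId;
        this.checksumOk = checksumOk;
    }
}

/** Illustrative helper (assumed name), not the recovery code from the branch. */
final class InProgressLogChooser {
    /**
     * Counts the leading run of valid transactions: each record must pass
     * its checksum and carry the next sequential transaction ID.
     */
    static int validLength(long firstTxId, List<EditRecord> records) {
        long expected = firstTxId;
        int count = 0;
        for (EditRecord r : records) {
            if (!r.checksumOk || r.txId != expected) {
                break; // padding or corruption ends the valid prefix
            }
            expected++;
            count++;
        }
        return count;
    }

    /**
     * Returns the index of the candidate with the most valid transactions,
     * or -1 if there are no candidates. The first candidate wins ties.
     */
    static int chooseLongest(long firstTxId, List<List<EditRecord>> candidates) {
        int best = -1;
        int bestLen = -1;
        for (int i = 0; i < candidates.size(); i++) {
            int len = validLength(firstTxId, candidates.get(i));
            if (len > bestLen) {
                bestLen = len;
                best = i;
            }
        }
        return best;
    }
}
```

Because only the valid prefix is counted, trailing bytes that fail the checksum or break the transaction ID sequence do not make a file look "longer" in this sketch.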
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071370#comment-13071370 ] Konstantin Shvachko commented on HDFS-1073: ---
Checked the journalling code. Generally it looks good. Some comments:
> Since it's only used for transferring edits ...
Exactly the point. The way you extracted it, it is only used for journaling. It is different from DatanodeProtocol as it does not have e.g. register(), so it is not a complete BackupNode protocol. Also the same journal protocol can be used in StandbyNode, and it would then be confusing to have BackupNode in the name. So I would either rename it to JournalProtocol or not factor this protocol out at all: keep the methods inside NameNodeProtocol. The latter makes sense to me if, as you proposed, different NameNodes will be the same entity with different roles or states.
> I see the main purpose of the design doc to guide development, rather than to document the architecture after it's done.
Documenting the architecture is important as it is the proof of correctness of the approach. Also I bet in a couple of months you will not remember all the details and will need that design doc to refresh your mind. I will.
> On disk, we can distinguish it from the non-failed streams since the non-failed streams will be longer
I hope "longer" does not mean file length? Besides that, your explanation seems reasonable. Do you attempt to restore bad streams on rollEdits() as done by attemptRestoreRemovedStorage() in the current implementation?
> I removed it as well as the isOperationSupported()
Good point. Just noticed the same thing. You should also be able to eliminate OP_JSPOOL_START as you don't have the FSEditLog.logJSpoolStart() method anymore. Having said that, how do you determine when to roll edits on the BackupNode without logJSpoolStart()? Explaining. 
In the current implementation OP_JSPOOL_START is sent as a part of the journal stream, so the BN knows exactly after which transaction the edits should be rolled. In your implementation logJSpoolStart() is replaced by startLogSegment(segmentTxId). Is it possible in your implementation that a) the BN has already processed transactions with a higher id than segmentTxId, or b) the BN hasn't yet seen the transaction preceding segmentTxId? According to Precondition this should not be possible. What guarantees that? > Simpler model for Namenode's fs Image and edit Logs > > > Key: HDFS-1073 > URL: https://issues.apache.org/jira/browse/HDFS-1073 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Sanjay Radia >Assignee: Todd Lipcon > Fix For: 0.23.0 > > Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073-merge.patch, > hdfs-1073-merge.patch, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, > hdfs1073.pdf, hdfs1073.tex > > > The naming and handling of NN's fsImage and edit logs can be significantly > improved resulting simpler and more robust code. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
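The invariant behind cases (a) and (b) in this comment can be written down in a few lines. The sketch below is illustrative only: SegmentOrderingCheck, its field, and its exception messages are assumed names, and it models the receiver as a single counter rather than the actual BackupNode code. It rejects both a segment that starts too far ahead and one that starts at or before a transaction already applied.

```java
/**
 * Hypothetical receiver of journaled edits; illustrative only, not the
 * BackupNode implementation from the HDFS-1073 branch.
 */
final class SegmentOrderingCheck {
    private long lastAppliedTxId;

    SegmentOrderingCheck(long lastAppliedTxId) {
        this.lastAppliedTxId = lastAppliedTxId;
    }

    /** Applies one transaction; IDs must arrive strictly in sequence. */
    void journal(long txId) {
        requireNext(txId, "journal");
        lastAppliedTxId = txId;
    }

    /** A segment may only start at the transaction following the last applied one. */
    void startLogSegment(long segmentTxId) {
        requireNext(segmentTxId, "startLogSegment");
    }

    long lastAppliedTxId() {
        return lastAppliedTxId;
    }

    /** Fails loudly instead of silently accepting a gap or a replay. */
    private void requireNext(long txId, String op) {
        long expected = lastAppliedTxId + 1;
        if (txId != expected) {
            throw new IllegalStateException(
                op + ": expected txid " + expected + " but got " + txId);
        }
    }
}
```

A check of this kind does not by itself guarantee ordering; it only turns an ordering bug into an immediate failure.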
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071260#comment-13071260 ] Todd Lipcon commented on HDFS-1073: --- OK. I committed the rename to the branch in r1151192. > Simpler model for Namenode's fs Image and edit Logs > > > Key: HDFS-1073 > URL: https://issues.apache.org/jira/browse/HDFS-1073 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Sanjay Radia >Assignee: Todd Lipcon > Fix For: 0.23.0 > > Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073-merge.patch, > hdfs-1073-merge.patch, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, > hdfs1073.pdf, hdfs1073.tex > > > The naming and handling of NN's fsImage and edit logs can be significantly > improved resulting simpler and more robust code. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070956#comment-13070956 ] Matt Foley commented on HDFS-1073: -- Yes, that would be okay. Thanks. > Simpler model for Namenode's fs Image and edit Logs > > > Key: HDFS-1073 > URL: https://issues.apache.org/jira/browse/HDFS-1073 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Sanjay Radia >Assignee: Todd Lipcon > Fix For: 0.23.0 > > Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073-merge.patch, > hdfs-1073-merge.patch, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, > hdfs1073.pdf, hdfs1073.tex > > > The naming and handling of NN's fsImage and edit logs can be significantly > improved resulting simpler and more robust code. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070748#comment-13070748 ] Todd Lipcon commented on HDFS-1073: --- Matt: How about changing it to StoragePurger, etc, as Ivan suggests? > Simpler model for Namenode's fs Image and edit Logs > > > Key: HDFS-1073 > URL: https://issues.apache.org/jira/browse/HDFS-1073 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Sanjay Radia >Assignee: Todd Lipcon > Fix For: 0.23.0 > > Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073-merge.patch, > hdfs-1073-merge.patch, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, > hdfs1073.pdf, hdfs1073.tex > > > The naming and handling of NN's fsImage and edit logs can be significantly > improved resulting simpler and more robust code. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070745#comment-13070745 ] Todd Lipcon commented on HDFS-1073: ---
Committed a number of small fixes to the branch to address the following from Konstantin's review:
bq. EditLogOutputStream does not extend OutputStream, what is the reason for that?
EditLogOutputStream is moving towards being more like a "sink for journal records" rather than a straight output stream. That is to say, the abstraction is a sequence of edits, rather than a sequence of bytes. So extending OutputStream wasn't really buying us anything. Good point about the javadoc inheritance. I added JavaDoc now that it is its own class. I also removed the {{write(int)}} API which was only used internally.
bq. BackupImage.BNState - convert field descriptions to JavaDoc.
Fixed.
bq. BackupNodeProtocol would it be better to call it JournalProtocol
Since it's only used for transferring edits to the BackupNode (and not any other type of journaling) I think the current name makes more sense. It's also more consistent with the other protocols like DatanodeProtocol and NamenodeProtocol.
bq. In FSEditLogOpCodes comment below refers to a non existing entity. Probably redundant...
Fixed. I also noticed that OP_JSPOOL_START was referenced in the code in a few places, but is no longer necessary. I removed it as well as the isOperationSupported() call which was no longer necessary.
bq. JournalManager, BackupJournalManager, FileJournalManager should not be public. BackupJournalManager needs JavaDoc.
Fixed.
bq. FSImageOldStorageInspector: "Old" is not informative. Could be something like PreTransactional or Plain or something.
Good idea. I renamed it to PreTransactional.
bq. Good design doc, but somewhat outdated. Do you plan to update it some time?
I see the main purpose of the design doc to guide development, rather than to document the architecture after it's done. 
I will try to update any places where it's grossly inaccurate after we've merged this to trunk, though. bq. FSEditLog.JournalAndStream should not have public methods if possible. Fixed. {quote} I propose to wrap the part of doCheckpoint() that transfers image and edits from NN in downloadCheckpoint() method to make the former more readable. Also I recommend first downloading all files (image and edits), then applying them to memory. Now you do: download image, apply image, download edits, apply edits. Should be: download, download, apply, apply. That way it will fail fast if download is not successful. {quote} I looked into doing this, but it wasn't straightforward, since we don't always need to download the image. Would it be alright to address this after the merge? I can file a JIRA so it doesn't get forgotten. bq. When a stream gets bad, we should force syncing remaining journal streams, don't we? Otherwise there is no way to distinguish between failed streams and the valid ones. Or did I miss something? I'm not sure I follow. We only detect that a stream is bad when we're syncing, so we're syncing the other ones at the same time anyway. We know that stream is bad because {{JournalAndStream.isActive}} returns false after the stream is aborted. On disk, we can distinguish it from the non-failed streams since the non-failed streams will be longer during log recovery. See {{TestFSImageStorageInspector.testLogGroupRecoveryInProgress}} as well as some of the crash-recovery related test cases in {{TestFSEditLog}}. bq. TransferFsImage should send/receive CheckpointSignature as a parameter to make sure that requests belong to the valid checkpoint We do validate that the request belongs to the correct namespace by passing the namespaceID/clusterID/etc. See GetImageServlet.java:92 and {{TestCheckpoint.testReformatNNBetweenCheckpoints}}. 
> Simpler model for Namenode's fs Image and edit Logs > > > Key: HDFS-1073 > URL: https://issues.apache.org/jira/browse/HDFS-1073 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Sanjay Radia >Assignee: Todd Lipcon > Fix For: 0.23.0 > > Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073-merge.patch, > hdfs-1073-merge.patch, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, > hdfs1073.pdf, hdfs1073.tex > > > The naming and handling of NN's fsImage and edit logs can be significantly > improved resulting simpler and more robust code. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070405#comment-13070405 ] Ivan Kelly commented on HDFS-1073: -- {quote} Similarly in JournalManager, the method archiveLogsOlderThan() could be renamed disposeLogsOlderThan(). {quote} HDFS-2018 renames this to {code} void purgeTransactions(long minTxIdToKeep) throws IOException; {code} to match the design doc for HDFS-1580 > Simpler model for Namenode's fs Image and edit Logs > > > Key: HDFS-1073 > URL: https://issues.apache.org/jira/browse/HDFS-1073 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Sanjay Radia >Assignee: Todd Lipcon > Fix For: 0.23.0 > > Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073-merge.patch, > hdfs-1073-merge.patch, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, > hdfs1073.pdf, hdfs1073.tex > > > The naming and handling of NN's fsImage and edit logs can be significantly > improved resulting simpler and more robust code. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070320#comment-13070320 ] Matt Foley commented on HDFS-1073: -- I like the new model, and I like the way upgrade is handled. In NNStorageArchivalManager and JournalManager: I have difficulty with naming a method or interface "archiver" and then implementing it as "delete". That seems an incorrect abstraction. How about changing the name of the i/f StorageArchiver to "StorageDisposition" with methods "disposeLog" and "disposeImage"? Then it could have implementations DeletionStorageDisposition and ArchiverStorageDisposition, without prejudice. Similarly in JournalManager, the method archiveLogsOlderThan() could be renamed disposeLogsOlderThan(). I'm not set on the specific word choice "disposition" or "dispose", but I think that if an allowable implementation of an interface is deletion, then it shouldn't be named "Archive". > Simpler model for Namenode's fs Image and edit Logs > > > Key: HDFS-1073 > URL: https://issues.apache.org/jira/browse/HDFS-1073 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Sanjay Radia >Assignee: Todd Lipcon > Fix For: 0.23.0 > > Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073-merge.patch, > hdfs-1073-merge.patch, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, > hdfs1073.pdf, hdfs1073.tex > > > The naming and handling of NN's fsImage and edit logs can be significantly > improved resulting simpler and more robust code. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
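The naming proposal in this comment could look roughly like the following. It is a sketch under assumptions: only the interface and class names and the two method names come from the comment, while the method signatures, the use of java.nio.file (Java 7 or later), and the archive directory parameter are invented here for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** What to do with a log or image that no longer needs to stay online. */
interface StorageDisposition {
    void disposeLog(Path log) throws IOException;

    void disposeImage(Path image) throws IOException;
}

/** Disposition by deletion. */
final class DeletionStorageDisposition implements StorageDisposition {
    @Override
    public void disposeLog(Path log) throws IOException {
        Files.deleteIfExists(log);
    }

    @Override
    public void disposeImage(Path image) throws IOException {
        Files.deleteIfExists(image);
    }
}

/** Disposition by moving files into an archive directory (assumed parameter). */
final class ArchiverStorageDisposition implements StorageDisposition {
    private final Path archiveDir;

    ArchiverStorageDisposition(Path archiveDir) {
        this.archiveDir = archiveDir;
    }

    @Override
    public void disposeLog(Path log) throws IOException {
        moveToArchive(log);
    }

    @Override
    public void disposeImage(Path image) throws IOException {
        moveToArchive(image);
    }

    private void moveToArchive(Path file) throws IOException {
        Files.createDirectories(archiveDir);
        Files.move(file, archiveDir.resolve(file.getFileName()),
            StandardCopyOption.REPLACE_EXISTING);
    }
}
```

With a neutral interface name, neither implementation contradicts the abstraction, which is the point the comment makes about "Archive" versus "delete".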
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070307#comment-13070307 ] Konstantin Shvachko commented on HDFS-1073: ---
Looked at the checkpoint code. Comments:
# EditLogOutputStream does not extend OutputStream, what is the reason for that? If it does not, then there is nowhere to inherit javaDoc from, which it does.
# BackupImage.BNState - convert field descriptions to JavaDoc.
# BackupNodeProtocol would it be better to call it JournalProtocol?
# In FSEditLogOpCodes comment below refers to a non existing entity. Probably redundant. {{ // must be same as NamenodeProtocol.JA_JSPOOL_START}}
# JournalManager, BackupJournalManager, FileJournalManager should not be public. BackupJournalManager needs JavaDoc.
# FSImageOldStorageInspector: "Old" is not informative. Could be something like PreTransactional or Plain or something.
# Good design doc, but somewhat outdated. Do you plan to update it some time?
# FSEditLog.JournalAndStream should not have public methods if possible.
# I propose to wrap the part of doCheckpoint() that transfers image and edits from NN in downloadCheckpoint() method to make the former more readable. Also I recommend first downloading all files (image and edits), then applying them to memory. Now you do: download image, apply image, download edits, apply edits. Should be: download, download, apply, apply. That way it will fail fast if download is not successful.
# When a stream gets bad, we should force syncing remaining journal streams, don't we? Otherwise there is no way to distinguish between failed streams and the valid ones. Or did I miss something?
# TransferFsImage should send/receive CheckpointSignature as a parameter to make sure that requests belong to the valid checkpoint. If it is hard to do it in this jira, let's open a new one (if not opened already).
Will look at journalling next. 
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070203#comment-13070203 ]

Hudson commented on HDFS-1073:
--

Integrated in Hadoop-Hdfs-1073-branch #19 (See [https://builds.apache.org/job/Hadoop-Hdfs-1073-branch/19/])
Move SecondaryNameNode.rollForwardByApplyingEdits to Checkpointer. Remove unused code in EditLogBackupInputStream.
In response to Konstantin's review at: https://issues.apache.org/jira/browse/HDFS-1073?focusedCommentId=13070021&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13070021

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1150241
Files :
* /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/Checkpointer.java
* /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
* /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditLogBackupInputStream.java
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070077#comment-13070077 ] Todd Lipcon commented on HDFS-1073: --- Thanks for the comments. I did another sweep for unused imports and moved rollForwardByApplyingLogs like you suggested. I committed these changes to the branch.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070021#comment-13070021 ]

Konstantin Shvachko commented on HDFS-1073:
---

Did a quick sweep over warnings:
# Remove unused imports: CheckpointSignature, EditLogFileOutputStream, FSEditLogLoader, TransferFsImage, NamenodeProtocol, NamenodeRegistration, NamespaceInfo, EditsLoaderCurrent, ImageLoaderCurrent.
# Unused code: EditLogBackupInputStream.ByteBufferInputStream.getData()
# Move SecondaryNameNode.rollForwardByApplyingLogs() into Checkpointer to avoid deprecation warning and to ease removal of SNN in the future.

Looking further.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068631#comment-13068631 ] Konstantin Shvachko commented on HDFS-1073: --- The benchmarks look good to me.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068617#comment-13068617 ]

Todd Lipcon commented on HDFS-1073:
---

bq. This looks good to me

To be clear, do you mean the benchmark results, or the merge?
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068613#comment-13068613 ] Konstantin Shvachko commented on HDFS-1073: --- SSD isn't practical now. Heap should fit the namespace. Thanks. This looks good to me.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068612#comment-13068612 ]

Todd Lipcon commented on HDFS-1073:
---

Replying to Suresh [above|https://issues.apache.org/jira/browse/HDFS-1073?focusedCommentId=13068216&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13068216] regarding CHANGES.txt:

I agree that our current protocol isn't great. The problem with the way federation was done is that, although federation is a New Feature, many of the subcomponents were just bug fixes against the federation branch.

Would it be OK with you if I did the following?
- add in the NEW FEATURES section: "HDFS-1073. Redesign the NameNode's storage layout for image checkpoints and edit logs to introduce transaction IDs and be more robust. Please see HDFS-1073 section below for breakout of individual patches."
- add a new section called "BREAKOUT OF HDFS-1073 Subtasks" with the contents of CHANGES.HDFS-1073.txt
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068585#comment-13068585 ]

Todd Lipcon commented on HDFS-1073:
---

I ran NNThroughputBenchmark with 100 threads, 10 ops, three runs with trunk and three runs with 1073. The machine is Xeon E5540 at 2.53GHz, 8 cores w/ HT enabled. The edits disk is a single local SATA 7200rpm.

Here are the mean ops/sec for the various mutations:

||op||trunk||1073||
|create|5060|4993|
|open|28120|28950|
|delete|5552|5468|
|rename|5455|5451|

Looking at the FSEditLog log, the mean numbers are:

||stat||trunk||1073||
|Total time for txns|6926|6822|
|txns batched|1077882|1077876|
|number of syncs|22217|6|
|SyncTimes|202537|204165|

To summarize, there is a small hit on the write operations, since they now log more data. This also shows up in the higher SyncTimes. The read ops are unaffected (open actually benchmarked faster in 1073, but it had fairly high variance).
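To put the "small hit" in relative terms, the posted means can be turned into percentage changes. The snippet below is just a worked calculation over the numbers in the tables above; the class and method names are made up for this illustration and are not part of NNThroughputBenchmark.

```java
/** Worked arithmetic: relative change of the 1073 branch versus trunk. */
public class BenchmarkDelta {

    /** Change from trunk to branch, in percent (negative means the branch is lower). */
    public static double percentChange(double trunk, double branch) {
        return (branch - trunk) / trunk * 100.0;
    }

    public static void main(String[] args) {
        // Mean ops/sec copied from the comment above.
        String[] ops = {"create", "open", "delete", "rename"};
        double[] trunk = {5060, 28120, 5552, 5455};
        double[] branch = {4993, 28950, 5468, 5451};
        for (int i = 0; i < ops.length; i++) {
            System.out.printf("%-9s %+.2f%%%n", ops[i], percentChange(trunk[i], branch[i]));
        }
        // SyncTimes from the FSEditLog statistics.
        System.out.printf("%-9s %+.2f%%%n", "SyncTimes", percentChange(202537, 204165));
    }
}
```

On these figures create and delete drop by roughly 1.3% and 1.5%, rename by under 0.1%, open comes out about 3% higher, and SyncTimes rises by a little under 1%.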
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068561#comment-13068561 ] Todd Lipcon commented on HDFS-1073: --- Any preference whether the benchmark results are from a machine with SSD vs not? How much heap size do you typically configure for NNThroughputBenchmark?
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068555#comment-13068555 ] Konstantin Shvachko commented on HDFS-1073: --- Todd, could you please post your benchmark results? I usually run NNThroughputBenchmark with a variety of thread counts from 100 to 1000. You can choose whatever # is optimal when you post the results. We are mostly interested in operations like create, rename, delete, which update edits. But we should also verify that open and blockReport are the same as before, as they are not edits related. I plan to review the changes with a focus on the BN part.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068461#comment-13068461 ] Hudson commented on HDFS-1073: -- Integrated in Hadoop-Hdfs-1073-branch #15 (See [https://builds.apache.org/job/Hadoop-Hdfs-1073-branch/15/]) HDFS-2172. Address findbugs and javadoc warnings in HDFS-1073 branch. Contributed by Todd Lipcon. HDFS-2170. Address remaining TODOs in HDFS-1073 branch. Contributed by Todd Lipcon. Merge trunk into HDFS-1073 todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1148592 Files : * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImageTransactionalStorageInspector.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/findbugsExcludeFile.xml * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/Checkpointer.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NNStorageArchivalManager.java * /hadoop/common/branches/HDFS-1073/hdfs/CHANGES.HDFS-1073.txt * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/GetImageServlet.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditLogBackupOutputStream.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/util/AtomicFileOutputStream.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/util/MD5FileUtils.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/BackupImage.java todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1148589 Files : * 
/hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/FSImageTestUtil.java * /hadoop/common/branches/HDFS-1073/hdfs/CHANGES.HDFS-1073.txt * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestSaveNamespace.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestEditLogRace.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestParallelImageWrite.java todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1148533 Files : * /hadoop/common/branches/HDFS-1073/hdfs/src/webapps/datanode * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestDFSShell.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestWriteConfigurationToDFS.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/unit/org/apache/hadoop/hdfs/server/namenode/TestNNLeaseRecovery.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/branches/HDFS-1073/hdfs/src/contrib/hdfsproxy * /hadoop/common/branches/HDFS-1073/hdfs/CHANGES.txt * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/BackupNode.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestAbandonBlock.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java * 
/hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/security/token/block/TestBlockToken.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestEditLogRace.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodeMetrics.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/DFSUtil.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/ClusterJspHelper.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068216#comment-13068216 ]

Suresh Srinivas commented on HDFS-1073:
---

> This will generate one RAT warning due to CHANGES.HDFS-1073.txt. What's the
> best way to integrate the changelist into CHANGES.txt? Should I dump the
> entire list in, or just a single entry for HDFS-1073? Or perhaps a single
> entry for HDFS-1073 and then a section lower in the same CHANGES.txt file
> that itemizes it?

In federation I dumped the entire list into CHANGES.txt, with a Federation: tag in front of each change.

Our CHANGES.txt protocol is woefully inadequate. Recording trivial jiras in CHANGES.txt dilutes its value for people looking for important changes that are part of a release. Given that everything gets recorded in it, I decided to add all the entries.

On a separate note, we should rethink our policy of adding every change to CHANGES.txt. At least we should consider adding tags: trivial, minor, major, critical, incremental for easier consumption.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068126#comment-13068126 ]

Hadoop QA commented on HDFS-1073:
-

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12487094/hdfs-1073-merge.patch
against trunk revision 1148348.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 134 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
-1 javac. The applied patch generated 34 javac compiler warnings (more than the trunk's current 23 warnings).
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
-1 release audit. The applied patch generated 1 release audit warnings (more than the trunk's current 0 warnings).
-1 core tests. The patch failed these core unit tests:
org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer
+1 contrib tests. The patch passed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/973//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/973//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/973//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/973//console

This message is automatically generated.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068110#comment-13068110 ] Todd Lipcon commented on HDFS-1073: --- TestEditLogFileOutputStream was failing because it depended on the exact byte length of a mkdirs op. I'd written it with username 'todd', whereas the test ran with username 'hudson' - hence the mkdirs had a longer username and took 2 more bytes when running in the build. I just committed a small change (r1148591) to the test to only verify that the log length increased, rather than some specific length.
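The failure mode described above can be illustrated with a small sketch. The encoding here is an assumption made purely for illustration (a fixed number of bytes plus the UTF-8 bytes of the owner name); it is not the real mkdirs op layout, and the class and method names are hypothetical rather than taken from TestEditLogFileOutputStream.

```java
import java.nio.charset.StandardCharsets;

/** Sketch: why an exact-length assertion breaks when the username changes. */
public class LogLengthCheckSketch {

    /** Assumed encoding: fixedBytes of op data plus the owner name in UTF-8. */
    public static int opLength(int fixedBytes, String owner) {
        return fixedBytes + owner.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Brittle check: only holds for the user the expected size was recorded under. */
    public static boolean exactLengthCheck(long before, long after, int expectedOpBytes) {
        return after - before == expectedOpBytes;
    }

    /** Robust check: only requires that the log grew after the op was written. */
    public static boolean grewCheck(long before, long after) {
        return after > before;
    }
}
```

Under this assumed encoding, an op written as 'hudson' is two bytes longer than the same op written as 'todd', so the exact check fails on the build machine while the growth check passes for either user.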
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068105#comment-13068105 ]

Todd Lipcon commented on HDFS-1073:
---

bq. -1 javac. The applied patch generated 34 javac compiler warnings (more than the trunk's current 23 warnings).

This is due to many new test cases that use SecondaryNameNode (all the new warnings are just this deprecation warning). I'd like to consider undeprecating it, since it's very well tested and I think we still intend to recommend its use in 0.23.

bq. -1 release audit. The applied patch generated 1 release audit warnings (more than the trunk's current 0 warnings).

This is the CHANGES.HDFS-1073.txt file. Please see above question -- how should we integrate all of the subtask changelog items in the main CHANGES.txt?

bq. org.apache.hadoop.hdfs.server.namenode.TestEditLogFileOutputStream

This passes on my box. I'll try to log in to the Hudson servers to see what's up.

bq. org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer

This is failing because it requires a new binary file to be committed. It passes on the branch where the file is committed -- just not represented in the patch.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068074#comment-13068074 ]

Hadoop QA commented on HDFS-1073:
-

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12487071/hdfs-1073-merge.patch
against trunk revision 1148348.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 134 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
-1 javac. The applied patch generated 34 javac compiler warnings (more than the trunk's current 23 warnings).
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
-1 release audit. The applied patch generated 1 release audit warnings (more than the trunk's current 0 warnings).
-1 core tests. The patch failed these core unit tests:
org.apache.hadoop.hdfs.server.namenode.TestEditLogFileOutputStream
org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer
+1 contrib tests. The patch passed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/972//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/972//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/972//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/972//console

This message is automatically generated.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067956#comment-13067956 ]

Todd Lipcon commented on HDFS-1073:
---

Hi Jitendra. I've addressed your feedback above:

bq. TestEditLog.java#testSimpleEditLog: Exception in the cluster.shutdown is being ignored.

Committed a trivial fix in r1148480

bq. TestEditLog.java#testFailedOpen is disabled.

Addressed by HDFS-2168

bq. Commented out code in a few places with TODOs.

Fixed in HDFS-2170, HDFS-2169, and HDFS-2160

Please let me know if you have more feedback. I agree that 2018 will continue to improve the code, but as discussed on the list I think we should merge this, and then take care of 2018 and 1580, and do a second merge.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067806#comment-13067806 ]

Hudson commented on HDFS-1073:
--

Integrated in Hadoop-Hdfs-1073-branch #14 (See [https://builds.apache.org/job/Hadoop-Hdfs-1073-branch/14/])
HDFS-2160. Fix CreateEditsLog test tool in HDFS-1073 branch. Contributed by Todd Lipcon.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1148070
Files :
* /hadoop/common/branches/HDFS-1073/hdfs/bin/hdfs
* /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/CreateEditsLog.java
* /hadoop/common/branches/HDFS-1073/hdfs/CHANGES.HDFS-1073.txt
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066352#comment-13066352 ] Todd Lipcon commented on HDFS-1073: ---

Good catches in TestEditLog. Are you sure you're looking at the latest version of the branch, regarding TestBackupNode? That function was filled in by HDFS-1979, which I committed yesterday morning, I believe. I'll do another sweep for TODOs I might have missed.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066338#comment-13066338 ] Jitendra Nath Pandey commented on HDFS-1073:

> Ivan has opened HDFS-2149 - I'd propose we do that under that JIRA?

Sounds good. A few more minor comments:

* TestEditLog.java#testSimpleEditLog: Exception in the cluster.shutdown is being ignored.
* TestEditLog.java#testFailedOpen is disabled.
* TestBackupNode.java: waitCheckpointDone does nothing.
* Commented out code in a few places with TODOs.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065801#comment-13065801 ] Ivan Kelly commented on HDFS-1073: --

{quote}
LogHeader has a read method but not a write. Will it make sense to encapsulate both read and write of the header in the same class?

Agreed - Ivan has opened HDFS-2149 - I'd propose we do that under that JIRA?
{quote}

HDFS-2149 will probably remove LogHeader completely. I plan to add a getVersion() call to InputStreams, and each stream will handle its own metadata internally. So EditLogFileInputStream will read its version on creation, or on the first call to read, etc. The input and output streams will be packet based, so an input stream is basically an iterator over FSEditLogOp objects and an output stream is a sink for FSEditLogOp objects. I think the way I've implemented the FSEditLogOp objects should avoid all extra copies and object creation. What's more, there's plenty of room to improve this by removing the creation of ArrayWritables and DeprecatedUTF8 objects and just writing strings and arrays directly.
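The stream model described in the comment above (an input stream as an iterator over ops that reports its own version, an output stream as a sink for ops) can be sketched roughly as follows. This is only an illustration under stated assumptions: EditOp, EditLogInput, EditLogOutput and InMemoryEditLog are hypothetical names made up for this sketch, not the classes from HDFS-2149, and the version value is arbitrary.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for FSEditLogOp; the real class carries opcode-specific fields. */
final class EditOp {
    final long txid;
    final String description;

    EditOp(long txid, String description) {
        this.txid = txid;
        this.description = description;
    }
}

/** Hypothetical input side: an iterator over ops that also knows its own version. */
interface EditLogInput extends Iterator<EditOp> {
    int getVersion();
}

/** Hypothetical output side: a sink that accepts ops. */
interface EditLogOutput {
    void write(EditOp op);
}

/** In-memory log used only to show the shape of the two interfaces. */
final class InMemoryEditLog implements EditLogOutput {
    private final int version;
    private final List<EditOp> ops = new ArrayList<>();

    InMemoryEditLog(int version) {
        this.version = version;
    }

    @Override
    public void write(EditOp op) {
        ops.add(op);
    }

    /** Opens a reader over a snapshot of the ops; the reader reports the version itself. */
    EditLogInput openForRead() {
        final Iterator<EditOp> it = new ArrayList<>(ops).iterator();
        return new EditLogInput() {
            @Override public int getVersion() { return version; }
            @Override public boolean hasNext() { return it.hasNext(); }
            @Override public EditOp next() { return it.next(); }
        };
    }
}
```

With this shape, a loader only needs a loop over hasNext()/next() and never touches a separate header class, which is the simplification the comment is after.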
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065669#comment-13065669 ] Hudson commented on HDFS-1073: -- Integrated in Hadoop-Hdfs-1073-branch #9 (See [https://builds.apache.org/job/Hadoop-Hdfs-1073-branch/9/]) Small javadoc and unused imports cleanup in response to Jitendra's review See https://issues.apache.org/jira/browse/HDFS-1073?focusedCommentId=13064221&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13064221 Merge trunk into HDFS-1073 HDFS-2135. Fix regression of HDFS-1955 in HDFS-1073 branch. Contributed by Todd Lipcon. HDFS-2133. Address remaining TODOs and pre-merge cleanup on HDFS-1073 branch. Contributed by Todd Lipcon. Amend HDFS-2011 for HDFS-1073 branch. Update test cases for new behavior of EditLogFileOutputStream. Contributed by Todd Lipcon and Eli Collins. todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1146889 Files : * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditLogFileInputStream.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditLogOutputStream.java todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1146881 Files : * /hadoop/common/branches/HDFS-1073/hdfs/src/webapps/datanode * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/Host2NodesMap.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestDFSShell.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestHost2NodesMap.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/branches/HDFS-1073/hdfs/src/contrib/hdfsproxy * 
/hadoop/common/branches/HDFS-1073/hdfs/CHANGES.txt * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/DecommissionManager.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DataXceiverServer.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestWriteRead.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/DFSOutputStream.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/UpgradeObjectDatanode.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/Host2NodesMap.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailureToleration.java * /hadoop/common/branches/HDFS-1073/hdfs/src/webapps/hdfs * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java * 
/hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditLogFileOutputStream.java * /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestEditLogFileOutputStream.java * /hadoop/common/branches/HDFS-1073/hdfs/src/webapps/secondary * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceScanner.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/DFSClient.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java * /hadoop/common/branches/HDFS-1073/hdfs * /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/Decommission
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065526#comment-13065526 ] Todd Lipcon commented on HDFS-1073: ---

bq. EditLogFileInputStream doesn't have any change except for an unused import.

Good catch, fixed the import.

{quote}
EditLogOutputStream.java : abstract void write(byte[] data, int i, int length) All transactions should have a txid, therefore this write method is confusing.
{quote}

Agreed. This is used by the BackupNode, which currently receives only byte arrays that have to be journaled, rather than logical transaction records. I added a javadoc which explains its purpose, and renamed the offset parameter.

{quote}
What is the reason to persist start and end of log segments? Do we really need OP_START_LOG_SEGMENT and OP_END_LOG_SEGMENT?
{quote}

I remember discussing this at one point on JIRA, but I can't seem to find the comment. I think it was either Sanjay or Rob Chanselor who had suggested that we later extend these opcodes to carry a bit of extra information such as the timestamp, the hostname, the namespace ID, etc. They would serve as extra sanity checks and possibly be useful for debug/audit/etc. Of course right now they don't do a whole lot, but I think they are still useful during "forensics": e.g. when I'm looking at a log file in a hex editor, it would be nice to see one of these transactions at the end to know that it didn't somehow get truncated. Race condition bugs around rolling, like we've seen before, would also be a lot more obvious.

bq. LogHeader has a read method but not a write. Will it make sense to encapsulate both read and write of the header in the same class?

Agreed - Ivan has opened HDFS-2149 - I'd propose we do that under that JIRA?

bq. writeTransactionIdFileToStorage: The transaction id will be persisted along with the image and log files. For a running namenode, it will be in the in-memory state. It is not clear to me why do we need to persist a txid marker separately

This was added in HDFS-1801, with the rationale in [this comment|https://issues.apache.org/jira/browse/HDFS-1801?focusedCommentId=13026872&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13026872]. Basically it adds an extra safeguard so that if the last edit logs are somehow lost (or unavailable at startup), the storage directories will have enough info to detect it and prevent the NN from starting.

bq. There are unused imports in a few files.

Yep, thanks. Attached patch fixes most of them.

bq. I have a few concerns related to FSImageTransactionalStorageInspector, FSEditLogLoader, but those parts have been addressed in HDFS-2018. I recommend to commit HDFS-2018 in the branch as it significantly improves some parts of the code.

Let's continue to discuss there. I addressed the unused imports and javadoc fixes on the branch in r1146889.
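The txid-marker safeguard mentioned above can be illustrated with a small check. This is a sketch under assumptions, not the HDFS-1801 code: SeenTxidCheck and its parameter names are hypothetical, and it assumes the marker records the highest transaction ID the NN is known to have written.

```java
/** Hypothetical startup check; the names here are illustrative, not NNStorage's API. */
final class SeenTxidCheck {
    private SeenTxidCheck() {}

    /**
     * Refuses to proceed when the storage directories claim more history than
     * the image and edit logs can actually provide.
     *
     * @param lastTxidOnDisk highest transaction ID recoverable from image plus edit logs
     * @param seenTxidMarker transaction ID recorded in the separately persisted marker
     * @throws IllegalStateException if edits up to the marker cannot be found
     */
    static void verify(long lastTxidOnDisk, long seenTxidMarker) {
        if (lastTxidOnDisk < seenTxidMarker) {
            throw new IllegalStateException(
                "Storage recorded txid " + seenTxidMarker
                    + " but only edits up to txid " + lastTxidOnDisk
                    + " were found; some edit logs appear to be missing");
        }
    }
}
```

Without the separate marker, a directory that silently lost its most recent edit log would look like a consistent but older namespace; the comparison is what turns that situation into a startup failure.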
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064221#comment-13064221 ] Jitendra Nath Pandey commented on HDFS-1073:

A few comments:

1. EditLogFileInputStream doesn't have any change except for an unused import.
2. EditLogOutputStream.java : abstract void write(byte[] data, int i, int length) All transactions should have a txid, therefore this write method is confusing. I guess it would be cleaned up with backup node fix. Please change the parameter name 'i' to offset.
3. FSEditLog.java: What is the reason to persist start and end of log segments? Do we really need OP_START_LOG_SEGMENT and OP_END_LOG_SEGMENT?
4. FSEditLogOp.java - LogHeader has a read method but not a write. Will it make sense to encapsulate both read and write of the header in the same class?
5. NNStorage.java - writeTransactionIdFileToStorage: The transaction id will be persisted along with the image and log files. For a running namenode, it will be in the in-memory state. It is not clear to me why do we need to persist a txid marker separately.
6. There are unused imports in a few files.
7. I have a few concerns related to FSImageTransactionalStorageInspector, FSEditLogLoader, but those parts have been addressed in HDFS-2018. I recommend to commit HDFS-2018 in the branch as it significantly improves some parts of the code.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13063992#comment-13063992 ] Todd Lipcon commented on HDFS-1073: ---

Hairong asked me to comment describing what testing I've done on the branch. Here's a summary:

- Lots of new unit tests: about 3600 lines of net new test code, ~1000 lines updated. Total of 56 new test cases by my grepping.
- Stress testing of 2NNs:
-- Case 1: Start one NN with two data dirs. Start two 2NNs configured with checkpoint period of 0 (checkpoint as fast as possible). Let it loop for several hours to make sure nothing crashes.
-- Case 2: Start one NN with two data dirs, one of which is on a filesystem mounted on top of software RAID configured in "faulty" mode. Set the "faulty" RAID driver to throw an IO error every 10,000 reads. Start 2NN with checkpoint period 0, run for several minutes, making sure the injected IO errors are handled correctly. Eventually the ext3 filesystem ends up remounting itself as read-only. fsck and remount the filesystem while the NN is running, make sure it can be restored correctly.
-- Both of the above tests are run while a separate program with 10 threads pounds "mkdirs" and "delete" calls into the NN as fast as it can.
- Stress testing of BN:
-- Start NN. Start load generator (spamming mkdirs and delete calls).
-- Start BN with checkpoint configured once a minute.
-- Periodically stop load generator, issue mkdirs on NN and BN and make sure results are identical.
-- Take md5sum of files in BN's name dir, NN's namedir - verify that MD5s match.
-- Resume load generation.

The above testing yielded a couple of bugs, which I then converted to functional tests to prevent regressions.
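The "take md5sum of files in both name dirs and verify that they match" step from the BN stress test above could be automated along these lines. This is a minimal sketch, not the script actually used for the testing: NameDirChecksums is a hypothetical helper, and it assumes that comparing every regular file directly inside the two directories by name and MD5 is the intended check.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper that compares two name directories file by file. */
final class NameDirChecksums {
    private NameDirChecksums() {}

    /** Returns file name -> hex MD5 for every regular file directly inside dir. */
    static Map<String, String> md5OfFiles(Path dir) {
        try {
            List<Path> files;
            try (Stream<Path> listing = Files.list(dir)) {
                files = listing.filter(Files::isRegularFile).collect(Collectors.toList());
            }
            Map<String, String> sums = new TreeMap<>();
            for (Path file : files) {
                MessageDigest md5 = MessageDigest.getInstance("MD5");
                byte[] digest = md5.digest(Files.readAllBytes(file));
                StringBuilder hex = new StringBuilder();
                for (byte b : digest) {
                    hex.append(String.format("%02x", b));
                }
                sums.put(file.getFileName().toString(), hex.toString());
            }
            return sums;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available on this JVM", e);
        }
    }

    /** True when both directories hold the same file names with identical contents. */
    static boolean sameContents(Path dirA, Path dirB) {
        return md5OfFiles(dirA).equals(md5OfFiles(dirB));
    }
}
```

Reading whole files into memory keeps the sketch short; for large images a streaming digest would be the more sensible choice.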
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13063638#comment-13063638 ] Todd Lipcon commented on HDFS-1073: ---

Konstantin: good idea about running NNThroughputBenchmark. Do you have a preferred configuration you can suggest (i.e. in terms of number of threads, etc.)? Initial results indicate a slowdown of a few percent for operations which sync edit logs, because each edit log entry is now 8 bytes longer given that it includes a transaction ID. Read operations seem unaffected, given the changes don't touch those code paths.
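The 8-byte figure above is simply the size of a Java long. The sketch below shows the arithmetic with a made-up record layout (one opcode byte, an optional txid, then an opaque payload); it is an assumption for illustration only and is not the actual edit log format.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical record encoder used only to show the per-entry size difference. */
final class EditRecordSize {
    private EditRecordSize() {}

    /** Serializes opcode, optional txid and payload, and returns the number of bytes written. */
    static int encodedSize(byte opcode, Long txid, byte[] payload) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeByte(opcode);
            if (txid != null) {
                out.writeLong(txid); // a long is always written as 8 bytes
            }
            out.write(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }
}
```

For a 20-byte payload this gives 21 bytes without a txid and 29 bytes with one, so the relative overhead is largest for small operations.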
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061833#comment-13061833 ] Konstantin Shvachko commented on HDFS-1073: ---

I would also like to ask for some benchmarks to make sure we do not lose performance for NN operations. NNThroughputBenchmark is applicable in this case. But other tests are welcome as well.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031268#comment-13031268 ] Sanjay Radia commented on HDFS-1073:

Todd, very good document; the effort you have put in clearly shows. It will serve as the design doc for this very critical part of the NN. I have a few minor suggestions to improve the document.

* Add a motivation section (this jira has most of the stuff you need). I would include the following:
** Decouple the image and edits file naming to reduce code complexity when a new checkpoint is added.
** Allow the secondary and backup NN to generate a checkpoint without coordinating with the NN.
** Allow the NN to trigger the checkpoint rather than the secondary/backup.
** Allow one to implement an offline checkpointer.
* Mention somewhere that we need to add an NN shutdown command. Show that your design can accommodate it. I would prefer that in the case of shutdown, no edits log file is created, and hence we could have the option of not worrying about changes in edits opcodes during upgrade (see the discussion on HDFS-1822).
* Section 4.4 - by "group" you really mean "the edits across all dirs" - clarify.
* Section 4.5 - "Open question - upgrade when there wasn't a clean shutdown" -- IMHO no, we do not need to support it. I prefer that there should be a clean shutdown (see my comment above) - not just save namespace.

I will comment separately on the test cases.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029574#comment-13029574 ] Eli Collins commented on HDFS-1073: ---

How about uploading the tex file to JIRA? It would be easier for others to diff drafts and make edits.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029555#comment-13029555 ] Todd Lipcon commented on HDFS-1073: --- For those who want a diffable view of the evolution of the document, I pushed my repository here: https://github.com/toddlipcon/hdfs-1073-design/commits/master
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029541#comment-13029541 ] Todd Lipcon commented on HDFS-1073: ---

Thanks for the comments. I'll address them and upload a new draft.

bq. Since the 2NN has been deprecated and replaced I think we can remove it in a future release (eg 23), should we file a jira for that?

Yes, I think so. In the code currently, the CN and the 2NN are essentially different implementations of the exact same thing. I can't think of any reason that an operator would want to run the old implementation. Removing the 2NN would also allow us to concentrate our testing on just one of the implementations (right now the CN isn't well covered by tests).

Sanjay, do you agree?
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029493#comment-13029493 ] Eli Collins commented on HDFS-1073: ---

Hey Todd,

Here's my feedback on the design doc. Great writeup.

* Putting txn IDs in the file names does feel like the right call.
* Would be helpful to have a short problem statement section, e.g. why the new structure is less error prone, how enabling shared storage for metadata enables HA, etc.
* Section 1.2, is the new OP_INVALID filler relevant here since a journal will also contain these txns in practice?
* Section 3.1, part 5, I think this is useful to expose as an admin function. It's good to decouple log rolling from saving the namespace from an administrator's perspective.
* Section 4.1, one sentence defining log recovery would be helpful.
* Section 4.4, Step 5. I think it should load/apply edits_inprogress_Q, not just open it.
* Section 4.5. For your open question, I don't think we should support *upgrade* from a namespace that was not cleanly shut down. I.e. let's restrict the space of logs an upgrade needs to deal with: the admin should start and cleanly shut down the Namenode before upgrading, which seems reasonable to require, and should be the common case anyway.
* Section 6, bullets 2 and 4, should we use CheckpointNode here and throughout the doc to be consistent?
* Section 6, bullet 8, can remove, this is already done.
* Section 7.1, link is broken.
* Section 7.6, s/will be/will/
* Section 8, clarify that N is the number of *past* images; there also needs to be N saved images if given N image directories.
* Section 8, bullet 3. Strongly agree, the rolling shouldn't be articulated in terms of file deletion, i.e. something generic like move, archive, or "trash" seems better.
* Section 9.1. Agree the BN should trigger checkpoints via an op. Please add a note as to why that's better than the current approach.
* Section 9.2. The CheckpointNode could be modified to use this tool. This will help make it more generally useful, e.g. for performing checkpointing for any number of namenodes, vs being a CheckpointNode for a given Namenode.

Since the 2NN has been deprecated and replaced I think we can remove it in a future release (eg 23), should we file a jira for that?

Thanks,
Eli
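To make the "txn IDs in the file names" point above concrete, here is a small sketch of recovering a starting transaction ID from an in-progress edits file name. Only the edits_inprogress_Q form appears in this thread; treating Q as a plain decimal number directly after the prefix is an assumption, and EditsFileNames is a hypothetical helper rather than anything from the branch.

```java
import java.util.OptionalLong;

/** Hypothetical parser for names of the form edits_inprogress_<txid>. */
final class EditsFileNames {
    static final String IN_PROGRESS_PREFIX = "edits_inprogress_";

    private EditsFileNames() {}

    /** Returns the first txid of an in-progress segment, or empty if the name does not match. */
    static OptionalLong startTxidOfInProgress(String fileName) {
        if (!fileName.startsWith(IN_PROGRESS_PREFIX)) {
            return OptionalLong.empty();
        }
        String suffix = fileName.substring(IN_PROGRESS_PREFIX.length());
        if (suffix.isEmpty() || !suffix.chars().allMatch(Character::isDigit)) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(suffix));
        } catch (NumberFormatException e) {
            // Digits only, but too large for a long.
            return OptionalLong.empty();
        }
    }
}
```

Because the transaction ID is in the name, a recovery tool can decide which segment to load and apply by listing a directory, without opening any of the files first.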
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029139#comment-13029139 ] Todd Lipcon commented on HDFS-1073: --- Status update: Merged federation into this branch. Next subtasks up and ready for commit are HDFS-1892, HDFS-1799, HDFS-1800, HDFS-1801.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021919#comment-13021919 ] Todd Lipcon commented on HDFS-1073: --- I'm back from vacation and just merged trunk into the development branch, so that the branch compiles again.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017286#comment-13017286 ] Todd Lipcon commented on HDFS-1073: --- For those following the work on this branch: I will be on vacation tomorrow through 4/18, and back on this as a top priority starting 4/19. Thanks!
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015578#comment-13015578 ] Sharad Agarwal commented on HDFS-1073: -- I am out till April 11 and will get back to you on my return. Thanks
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015573#comment-13015573 ] stack commented on HDFS-1073: - Just to say that I've started following along on this issue, but it's kinda hard to figure out the plan from reading the above comments alone (I'm sure I missed a few of the switchbacks reading through). Any chance of the design doc getting updated to reflect what was agreed -- it doesn't seem to match -- and what's being pursued out on the branch? Thanks.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015574#comment-13015574 ] stack commented on HDFS-1073: - Oh, I'm asking because I'm trying to help out.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012343#comment-13012343 ] Todd Lipcon commented on HDFS-1073: --- As discussed on the mailing list, I've created a branch for this JIRA and its subtasks. This will make intermediate review easier. Any interested parties, please watch the branch and the subtasks of this JIRA.
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008682#comment-13008682 ] Sanjay Radia commented on HDFS-1073: Noticed the progress on HDFS-1521. Todd, are you planning to add a subtask where the actual edits and fsimage files are named using the txId or will this be part of this Jira itself? Any intermediate patch on this part for review?
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003236#comment-13003236 ] Todd Lipcon commented on HDFS-1073: --- Progress is coming along on this issue. I've pushed a git branch for the work-in-progress here: https://github.com/toddlipcon/hadoop-hdfs/tree/hdfs-1073-march I've based this branch on top of HDFS-1521 and HDFS-1538.
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933198#action_12933198 ] Sanjay Radia commented on HDFS-1073: On the "roll transaction" and the "quit transaction" we can add info such as # of files, dirs, blocks etc in the NN. This can be useful for sanity checking and testing.
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930238#action_12930238 ] Sanjay Radia commented on HDFS-1073:

> if anyone may ask for a roll - ie CN, BN, or NN. ...

This is a little ambiguous. Let's clarify:

* Longer term - we may need some time to get here since we want to minimize the changes to the BNN protocol.
** An admin can ask for a roll.
** NN does a roll when the size of the edits is big - say every 10K operations (a configurable parameter).
** A NewCheckpointer is given a set of fsimages and edits and creates a new checkpointed fsimage. The new fsimage is then copied OFFLINE to the NN, or wherever else we want it (say NFS, or HDFS). ie there is NO protocol between the NN and the NewCheckpointer. When this NewCheckpointer is available we can deprecate the old CheckpointNN (CN).
** BNN does NOT ask for a roll - it simply observes the rolls by the "roll transaction". - Does this work or have I misunderstood the design of the BNN?
* Shorter term (ie this Jira and release 22)
** An admin can ask for a roll.
** NN does a roll when the size of the edits is big - say every 10K operations (a configurable parameter).
** BNN can ask for a roll since this is already part of the protocol -- btw if this can be avoided in this jira then good but it may be too much change.
** The CN (really a variation of the BNN) can ask for a roll.
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930023#action_12930023 ] Konstantin Shvachko commented on HDFS-1073: ---

Rob, it seems we cannot have the "I'm quitting!" record simply because there is no "quit" or "shutdown" command. I agree the "rolled" transaction can be useful as a sanity check for the edits files that are not in-progress.

Todd, based on the design doc (which I should have read first) I don't see much difference between the current implementation and your new one, except that you don't need a side file to write the edits while spooling. Currently BN.startCheckpoint() causes NN.rollEdits(), which in turn sends back to BN the SPOOL_START record. This is when BN starts spooling. You seem to be calling the process of spooling (writing into the edits file but not applying to memory) "journal". That is what the state is called in your design, right? That may be confusing, as BN continues journaling (writing to the edits file) whether it is in synchronized or in spooling mode.

Also I don't see how you can get by with only 2 states for BN; you need 3. While spooling there are 2 active threads: one (the writer) is writing edits from NN directly to edits_K, another (the reader) is reading formerly written records from edits_K. At the end we need to switch the writer thread from writing to applying the records to the in-memory state, and shut down the second thread. This is where you need the third state, currently called WAIT. When the reader thread reaches the end of file it sets the WAIT state. The writer may still be writing before it sees the WAIT state. After seeing WAIT it blocks and waits until spooling is OFF. The reader reads the remaining records and turns spooling OFF.

Let me summarize the meaning of the current journal spool states:
- JSpoolState.OFF - spooling is off, apply edits to memory state and write into journal (edits file)
- JSpoolState.INPROGRESS - spooling in progress, do not apply to memory, just journal
- JSpoolState.WAIT - stop, do nothing, wait until spooling is OFF

Does that make sense?
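The three spool states summarized in the comment above could be modeled roughly as follows. This is a minimal Python sketch, not the BackupNode code: `JSpoolState` and its three values come from the comment, while `BackupJournal`, `on_edit`, and the lists standing in for the in-memory namespace and the edits file are hypothetical names used only for illustration.

```python
from enum import Enum


class JSpoolState(Enum):
    OFF = "off"                # apply edits to memory and write them to the journal
    INPROGRESS = "inprogress"  # journal only; do not apply to memory
    WAIT = "wait"              # do nothing until spooling is OFF again


class BackupJournal:
    """Hypothetical stand-in for how a BN handles one incoming edit."""

    def __init__(self):
        self.state = JSpoolState.OFF
        self.memory = []   # stand-in for the in-memory namespace
        self.journal = []  # stand-in for the edits file

    def on_edit(self, edit):
        """Return True if the edit was consumed, False if the writer must wait."""
        if self.state is JSpoolState.WAIT:
            return False
        # In both OFF and INPROGRESS the edit is journaled.
        self.journal.append(edit)
        # Only in OFF is it also applied to the in-memory state.
        if self.state is JSpoolState.OFF:
            self.memory.append(edit)
        return True
```

The sketch only shows why two states are not enough: journaling happens in both OFF and INPROGRESS, so a separate WAIT state is what lets the writer pause while the reader drains the remaining records.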
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928794#action_12928794 ] Robert Chansler commented on HDFS-1073: ---

bq. Worked so far.

How would you know? I just feel better having some check that the log is complete, especially in the new world where the log is a sequence of files. It's conceivable that not only could the last log file be truncated, but any number of log _files_ at the end of the log could be missing entirely. Of course, if the log files were being written to a more robust file system like HDFS, the need for integrity checks would be less.
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928764#action_12928764 ] Todd Lipcon commented on HDFS-1073: ---

Hey all. Back in town after a few weeks in Japan, sorry for the relative absence.

bq. I do not see or did not understand the rationale for "I'm quitting!" record. Why should NN care whether last record was lost or not, just keep going with what it has. Worked so far.

I think one complication here is that we currently never have to re-open an edits file for append, since when we start, we always save a "fresh" checkpoint image and empty "edits" if there were any edits to apply. One advantage of the new design is that we no longer have to do this - we just bump the edits log number to the next one in sequence - ie we roll on startup if the latest edit log is non-empty.

bq. Also the "rolled" transaction is a nice way to tell the BN that the primary did a roll without any special message from NN to BNN

The patch currently does exactly that - we just don't write down the special "roll" entry in any file streams. We certainly could, though, if it's useful to know that a file was completely written.

bq. Todd, I briefly looked at the patch. It looks like you are trying to get rid of the Journal Spool in BN. Correct me if I am wrong. I don't think you can

In the patch, the spooling has just become a bit more of a general case. Rather than spooling to a special file, we simply ask the primary NN to roll, and then wait for the roll to happen. While waiting for the roll, we continue to apply edits. Once we get the special "roll" record, we stop applying edits and make a checkpoint at that point. Once the checkpoint completes, we "converge" by continuing to read forward in the sequence of log files until we hit the end and are back "in sync".

bq. A backup NN should not ask for a roll. The primary should roll when it feels it is necessary.

I think the simplest will be if anyone may ask for a roll - ie CN, BN, or NN. The NN of course is the one that actually makes the decision, but the decision may be in response to a request from one of the other nodes. I think this ability is useful not just for CN, BN, and NN, but also for example in backup scripts - you may ask the NN to roll right before making a tarball of the edits directory, and thus be sure that you get all of the current edits in "finalized" files.
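The checkpoint flow described in the comment above (ask for a roll, keep applying edits until the "roll" record arrives, checkpoint at that point, then converge) might look something like this. It is a simplified, single-threaded Python sketch under stated assumptions: the `ROLL` marker, the function name, and the list used as the in-memory namespace are all hypothetical, and the real patch reads from a sequence of log files rather than an in-memory iterable.

```python
ROLL = "ROLL"  # hypothetical marker standing in for the special "roll" record


def checkpoint_at_next_roll(records, state):
    """Apply edits until the roll record, snapshot, then catch up.

    records: iterable of edits streamed from the primary after a roll was requested.
    state:   list standing in for the BN's in-memory namespace (mutated in place).
    Returns the snapshot taken at the roll boundary, or None if no roll was seen.
    """
    it = iter(records)
    snapshot = None
    for rec in it:
        if rec == ROLL:
            # Stop applying edits here: the checkpoint is cut exactly at the roll.
            snapshot = list(state)
            break
        state.append(rec)
    # "Converge": keep reading forward until we hit the end and are back in sync.
    for rec in it:
        if rec != ROLL:
            state.append(rec)
    return snapshot
```

For example, with `state = ["e1"]` and records `["e2", "e3", ROLL, "e4"]`, the snapshot holds `e1` through `e3`, and `state` ends up holding `e1` through `e4` once the function returns.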
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928682#action_12928682 ] Sanjay Radia commented on HDFS-1073:

> .. not understand the rationale for "I'm quitting!" record. Why should NN care whether last record was lost or not, just keep going with what it has.

The quitting record basically shows that the NN did a shutdown and did not die. This is useful to know. The NN will still continue to keep going as before. If we were to add a similar "rolled" transaction at the end of every roll then we could avoid the edits_100-100 since it will become edits_100-101. Also the "rolled" transaction is a nice way to tell the BN that the primary did a roll without any special message from NN to BNN.
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928672#action_12928672 ] Sanjay Radia commented on HDFS-1073:

> ... but then you get the problem of synchronizing the start of a checkpoint and the edits roll event. Otherwise checkpoints may become way behind the current namespace state.

I guess I am missing this. We should avoid the synchronization that has been there in the original design of the secondary NN. The BN can checkpoint whenever it feels that the set of rolled edits since previous checkpoint is large enough. It may be simpler to do it on every roll if we have configured the NN to roll say every 10K transactions. Perhaps what I am proposing works for the checkpointer but not for the BN because of some property of the BN that I am missing.
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928506#action_12928506 ] Konstantin Shvachko commented on HDFS-1073: --- Todd, I briefly looked at the patch. It looks like you are trying to get rid of the Journal Spool in BN. Correct me if I am wrong. I don't think you can. BN makes a checkpoint from its memory state, which distinguishes it from SNN and CN. While it does so, the namespace should be locked (for modifications), so the edits go into the journal spool, which is reapplied to memory after the checkpoint is finished. Please see the design doc in HADOOP-4539.
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928488#action_12928488 ] Konstantin Shvachko commented on HDFS-1073: ---

- Sanjay, Todd means that when a checkpoint starts it triggers rollEdits(), which is the cut-off point for the new checkpoint. The checkpoint of course can use the latest rolled edits instead, but then you get the problem of synchronizing the start of a checkpoint and the edits roll event. Otherwise checkpoints may become way behind the current namespace state.
- I agree with Rob that edits_100-100 should not be a special case to avoid. In practice we will not see it, but if it happens the system should just absorb it. Todd correctly points out that if the system is idle for a very long time NN may try to create edits_100-100 a second time, but this could simply be avoided based on the name collision.
- I do not see or did not understand the rationale for "I'm quitting!" record. Why should NN care whether last record was lost or not, just keep going with what it has. Worked so far.
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928283#action_12928283 ] Sanjay Radia commented on HDFS-1073:

> 4) No more edits, but a new BN starts up, and thus asks for another roll.

A backup NN should not ask for a roll. The primary should roll when it feels it is necessary.
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924135#action_12924135 ] Todd Lipcon commented on HDFS-1073: --- Hi Rob. You raise a good point. I think we'd have to do something where shutting down the NN with 100-inprogress would result in 100-100, and the NN would reopen that file as 100-inprogress upon restart. This seems messy to me - I would love to keep an invariant that once a file is "finalized" it is never renamed and its contents never change.
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924103#action_12924103 ] Robert Chansler commented on HDFS-1073: ---

Is the file 100-100 _forbidden_? What if the service is stopped when the most recent file has zero records? (I'd always write an "I'm quitting" record, otherwise you can never know if you have lost the last edits.) And what if there are files 100-200 and 100-300? Rather than different special cases, why not make the general case just work? Roll means roll regardless, and starting up finds the latest image and _any_ consistent sequence of edits that -starts with- includes the very next transaction, reporting whether the last available edit record is "I'm quitting!".

And catching up with Sanjay's comment about tx ids in every record, it would seem that the principal benefits are really obtained only if the tx id is assigned to requests as they are _received in sequence_. Just doing {{log.write(id++)}} doesn't offer much real protection. If there is a tx id per record, would it make sense for the actual bits to be the record check sum+id? Years ago we discussed having record check sums, but it never became a priority.

(In file N-M, I might have expected that the first record, if any, has tx id N, not N+1.)
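The "general case" startup rule proposed in the comment above (find the latest image, then any consistent sequence of edits that includes the very next transaction) could be sketched like this. It is a rough Python illustration, not anything from the patch. It assumes the naming convention under discussion in this thread, where a finalized file edits_N-M holds transactions N+1..M; the function name, the regex, and the tie-breaking rule (prefer the file that reaches furthest) are assumptions made for the example.

```python
import re

# Matches finalized files only; edits_N-inprogress is deliberately not matched.
EDITS_RE = re.compile(r"^edits_(\d+)-(\d+)$")


def plan_edits(filenames, image_txid):
    """Pick finalized edit files that chain forward from image_txid.

    Assumes a file edits_N-M holds transactions N+1..M.
    Returns (chosen_files, last_txid_reached).
    """
    ranges = []
    for name in filenames:
        m = EDITS_RE.match(name)
        if m:
            start, end = int(m.group(1)), int(m.group(2))
            if end > start:  # an empty file such as edits_100-100 is simply absorbed
                ranges.append((start, end, name))

    chosen, txid = [], image_txid
    while True:
        # Any file whose range contains the very next transaction (txid + 1) will do.
        candidates = [r for r in ranges if r[0] <= txid < r[1]]
        if not candidates:
            return chosen, txid
        start, end, name = max(candidates, key=lambda r: r[1])
        chosen.append(name)
        txid = end  # strictly increases, so the loop terminates
```

With both edits_100-200 and edits_100-300 present and an image at txid 100, this picks edits_100-300 and stops at 300. A file that starts before the image's txid can still be chosen, in which case the loader would have to skip the transactions already reflected in the image.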
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923923#action_12923923 ] Suresh Srinivas commented on HDFS-1073: --- +1 for option A.
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923313#action_12923313 ] dhruba borthakur commented on HDFS-1073: I prefer option (a).
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923276#action_12923276 ] Todd Lipcon commented on HDFS-1073: --- Converted HDFS-259 as a subtask. While we are cleaning up and redoing this section of the code, it will make things clearer to get rid of the code pertaining to ancient layout versions.
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923165#action_12923165 ] Todd Lipcon commented on HDFS-1073: ---

I'm starting to think about how to convert the current code over to the txid based numbering, and came upon a design point I wanted to discuss here: What should we do about the case when the edits should be rolled, but there have been no transactions since the last roll? For example, consider the following sequence:

1) Start up a fresh NN. We are writing to file edits_0-inprogress
2) Perform 100 edits - now current txid is 100.
3) Perform a roll. This renames edits_0-inprogress to edits_0-100 and opens edits_100-inprogress
4) No more edits, but a new BN starts up, and thus asks for another roll.

Thus we would like to create edits_100-100, a file with no edits, which is a little bit strange, and will cause issues the next time we roll (we'll end up with edits_100-100 and also edits_100-200 for example).

It seems the options are:

a) if asked to roll when we have not written any transactions to our current log, it is a no-op
b) whenever we roll, we append a special "trailer" transaction. Thus every log has at least 1 edit in it. I don't really like this, since it means that after a crash, we'll have a log without a trailer, which will add edge cases to worry about.

I'm leaning towards A. Am I missing another good solution?
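The four-step sequence and option (a) from the comment above can be walked through with a small in-memory model. This is only a Python sketch of the naming scheme as described in the comment (edits_N-inprogress renamed to edits_N-M on roll); the `EditLog` class and its methods are hypothetical and do not touch any files.

```python
class EditLog:
    """Hypothetical in-memory model of txid-based edit log naming."""

    def __init__(self):
        self.txid = 0           # last transaction id written
        self.segment_start = 0  # txid at which the in-progress file was opened
        self.finalized = []     # names of finalized files, oldest first

    @property
    def in_progress(self):
        return f"edits_{self.segment_start}-inprogress"

    def log_edit(self):
        self.txid += 1

    def roll(self):
        """Finalize the current file; a no-op if it holds no transactions (option a)."""
        if self.txid == self.segment_start:
            return False
        self.finalized.append(f"edits_{self.segment_start}-{self.txid}")
        self.segment_start = self.txid
        return True
```

Replaying the sequence: after 100 calls to `log_edit()`, the first `roll()` finalizes edits_0-100 and opens edits_100-inprogress; a second `roll()` with no intervening edits returns `False`, so edits_100-100 is never created.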
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12918102#action_12918102 ] Sanjay Radia commented on HDFS-1073:

> So are you suggesting that each edit will include a header with the transaction ID in it? ..

Actually I was, but your suggestion works. However, having a txid in each record helps sanity checks and debugging. There have been cases where we have got the transactions reversed in the past. The main cost is reading the extra 8 bytes and de-serializing the number. (BTW ZooKeeper does put the txid in each editlog record.)
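As an illustration of the sanity check mentioned in the comment above, a reader that sees a txid on every record can flag reordered or missing transactions as it goes. This is a minimal Python sketch; the function name and the idea of passing txids as a plain list are assumptions for the example, not how the edit log loader is written.

```python
def first_txid_gap(txids, expected_first):
    """Return the index of the first record whose txid is not previous + 1, or None."""
    expected = expected_first
    for i, txid in enumerate(txids):
        if txid != expected:
            return i  # reversed, duplicated, or missing transaction at this record
        expected += 1
    return None
```

A log whose records carry txids 101, 103, 102 would be flagged at index 1, which is the kind of reversed-transaction case the comment refers to.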
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12918088#action_12918088 ] Sanjay Radia commented on HDFS-1073:

>> In order to do an offline fsck one can needs to dump the block map; ...
> Sorry, can you elaborate a little bit here? ...

This has nothing to do with the Backup namenode. Currently the fsck is implemented inside the NN. We would like to do this offline. So one could do a dump of the block map and, at the start of the dump, record the transaction id. I believe that with this one would not have to lock the FSNamespace.

The above is just one use case of the transaction id. There are others. For example, during failover the transaction id would be useful for determining which one has the latest edits.
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916528#action_12916528 ]

Ivan Kelly commented on HDFS-1073:

I've been working on Todd's code to bring it up to date with trunk. Currently I've got it as far as passing all the smoke tests.

http://github.com/ivankelly/hadoop-hdfs/tree/hdfs-1073

It would be good if we could get a consensus on which numbering approach to take, so I can attack that problem before getting the tests up to 100%.
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915783#action_12915783 ]

Ivan Kelly commented on HDFS-1073:

{quote}
while writing edit logs to multiple files, a failure of the system can result in different amounts of data written to each file - the tid allows one to pick one with the most transactions.

Isn't this also doable by just seeing which has more non-zero bytes? ie seek to the end of the file, scan backwards through the 0 bytes, and stop. Whichever valid log is longer wins. Even in the case with the transaction-id, you have to do something like this for a few reasons: a) we'd rather scan backward from the end of the edit log than forward from the beginning, since it's going to be a faster startup, and b) even if we see a higher transaction id header on the last entry, that entry might have been incompletely written to the file, so we still have to verify that it deserializes correctly.
{quote}

The case of all edit logs being _inprogress during a crash should be very rare. Is it really an issue if it takes a little longer to determine which has the most transactions, if that cost is only going to be incurred after a bad crash?

{quote}
I don't think either way has been decided/rejected yet. What you're saying has been my view - that doing txid based is a bigger change, since we have to introduce the txid concept and add extra code that allows replaying partial edit log files (ie a subrange of the edits within). But it's certainly doable and Sanjay has presented some good advantages.
{quote}

FSEditLog already has a transaction concept which could be modified for this. Currently it's not stored anywhere, but it is used for logSync. It starts at 0 at NN startup and increases monotonically, resetting the next time the NN starts.
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900499#action_12900499 ]

Todd Lipcon commented on HDFS-1073:

Hey Sanjay,

Thanks for reviving this. The notes you wrote above seem accurate. Couple of questions:

bq. while writing edit logs to multiple files, a failure of the system can result in different amounts of data written to each file - the tid allows one to pick one with the most transactions.

Isn't this also doable by just seeing which has more non-zero bytes? ie seek to the end of the file, scan backwards through the 0 bytes, and stop. Whichever valid log is longer wins. Even in the case with the transaction-id, you have to do something like this for a few reasons: a) we'd rather scan backward from the end of the edit log than forward from the beginning, since it's going to be a faster startup, and b) even if we see a higher transaction id header on the last entry, that entry might have been incompletely written to the file, so we still have to verify that it deserializes correctly.

bq. Main disadvantage is that the editlogs will be little bigger.

So are you suggesting that each edit will include a header with the transaction ID in it? Isn't this redundant if the header of the whole edit file has the starting txid -- ie is there ever a case where we'd skip a txid?

bq. In order to do an offline fsck one needs to dump the block map; clearly one does not want to lock the system to do an atomic dump. The transaction id of when the dump is started can be written in the dump to allow the fsck to report consistently.

Sorry, can you elaborate a little bit here? In order to get a consistent dump of the block map don't we need to take the FSN lock and thus stall all operations? Is the idea that the BackupNode would do the blockmap dump offline since it can hold a lock for some time without stalling clients? If that's the case, what's the purpose of the offline nature of the fsck, instead of just having the BackupNode allow fsck to point directly at it and access memory under the same lock?

Mahadev said:

bq. Is it the minimum set of code changes that is making you guys reject the txn based snapshots and logging?

I don't think either way has been decided/rejected yet. What you're saying has been my view - that doing txid based is a bigger change, since we have to introduce the txid concept and add extra code that allows replaying partial edit log files (ie a subrange of the edits within). But it's certainly doable and Sanjay has presented some good advantages.
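The backward scan through trailing zero bytes described above can be sketched roughly as follows. This is an illustration only, not the NameNode's recovery code: `data_length` and `longest_log` are hypothetical names, and the sketch assumes the unwritten tail of each log file is zero-filled.

```python
import os


def data_length(path, chunk_size=4096):
    """Return the offset just past the last non-zero byte of the file."""
    with open(path, "rb") as f:
        end = f.seek(0, os.SEEK_END)
        # Walk backwards one chunk at a time until a non-zero byte shows up.
        while end > 0:
            start = max(0, end - chunk_size)
            f.seek(start)
            stripped = f.read(end - start).rstrip(b"\x00")
            if stripped:
                return start + len(stripped)
            end = start
    return 0  # empty file, or nothing but zero bytes


def longest_log(paths):
    """Pick the log copy with the most non-zero data."""
    return max(paths, key=data_length)
```

The offset found this way is only a candidate: a record that legitimately ends in zero bytes would be undercounted, and, as point b) notes, the last entry still has to be checked to confirm that it deserializes correctly.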
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900396#action_12900396 ]

Mahadev konar commented on HDFS-1073:

Sanjay,

I don't understand the disadvantage you are quoting here. As far as I can see, being able to seek to a specific transaction quickly (which the snapshot log with txnid enables you to do) is a good thing! Is it the minimum set of code changes that is making you guys reject the txn based snapshots and logging? As far as I read Todd's description, using transaction ids and naming the edit logs and image using the transaction ids enables you to do all those recoveries stated in Todd's document!
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900162#action_12900162 ]

Sanjay Radia commented on HDFS-1073:

Here is what I remember from our meeting in April. Todd, you took notes, please add anything I missed.

There were 2 issues under contention:
# Add transaction id to the edit logs
# Name the edit logs and image logs using the transaction id.

These are orthogonal to each other.

Adding a transaction id to the edit logs has the following advantages (only the first was discussed at the meeting; I am adding the other two):
* when a snapshot of the NN state is taken, one can record the tid for the snapshot - this is useful for knowing the diff between two snapshots etc.
* while writing edit logs to multiple files, a failure of the system can result in different amounts of data written to each file - the tid allows one to pick the one with the most transactions.
* In order to do an offline fsck one needs to dump the block map; clearly one does not want to lock the system to do an atomic dump. The transaction id of when the dump is started can be written in the dump to allow the fsck to report consistently.

The main disadvantage is that the edit logs will be a little bigger.

The main disadvantage of naming the edit logs using transaction ids is that the edit log reader needs to be able to seek forward to a specific transaction id. The advantages have been discussed above; I will summarize them in a separate comment.
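The "seek forward to a specific transaction id" requirement mentioned above can be sketched as skipping every edit that an image already covers and replaying only the rest. The function name `edits_after` and the in-memory `(txid, op)` pairs are assumptions for illustration; a real reader would decode records from the log file.

```python
from itertools import dropwhile


def edits_after(records, image_txid):
    """Yield the (txid, op) pairs newer than the image, in order.

    `records` is an iterable of (txid, op) pairs in log order.
    Raises ValueError if the remaining edits do not continue
    directly from image_txid + 1.
    """
    newer = dropwhile(lambda rec: rec[0] <= image_txid, records)
    expected = image_txid + 1
    for txid, op in newer:
        if txid != expected:
            raise ValueError(f"expected txid {expected}, found {txid}")
        yield txid, op
        expected += 1
```

For example, with an image taken at txid 6 and a log holding txids 5 through 8, only 7 and 8 are replayed. The forward scan is linear in the number of skipped records, which is the cost the comment above identifies.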
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856706#action_12856706 ]

Mahadev konar commented on HDFS-1073:

I haven't been able to read through all the comments, so pardon me if my comments do not make much sense. Regarding using sequence numbers for naming edits and snapshots versus using transaction ids, I would like to put forth a few reasons it has been really useful for us in ZooKeeper to use transaction ids:

- checks for missing transactions. With file names such as edits_txid and snapshot_txid it's very easy to check whether any transaction is missing.
- debugging and finding the transaction you need is very easy. Let's say you want to dump the transaction logs starting from transaction X. With the above scheme it becomes very easy to find the right transaction log to start dumping from, since no files need to be opened to check what transactions they contain.

Hopefully this helps.
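The lookup by file name described above might look like the following sketch. It assumes each log is named `edits_<txid>`, where `<txid>` is the first transaction the file contains; that convention and the helper names `start_txid` and `file_containing` are illustrative assumptions, not the naming HDFS or ZooKeeper actually uses.

```python
import bisect


def start_txid(name):
    """Parse the starting txid out of a name like 'edits_101'."""
    return int(name.rsplit("_", 1)[1])


def file_containing(names, txid):
    """Return the log file whose range starts at or before `txid`.

    Only the file names are inspected; no file is opened.
    """
    ordered = sorted(names, key=start_txid)
    starts = [start_txid(n) for n in ordered]
    i = bisect.bisect_right(starts, txid) - 1
    if i < 0:
        raise LookupError(f"no log file starts at or before txid {txid}")
    return ordered[i]
```

Given the files `edits_1`, `edits_101` and `edits_201`, a dump starting from transaction 150 would begin with `edits_101`. Sorting on the parsed number rather than the raw string keeps names such as `edits_1000` in the right order.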
[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856703#action_12856703 ]

Todd Lipcon commented on HDFS-1073:

Hi Sanjay/Konstantin,

Thanks for the comments and questions. I didn't originally anticipate writing a design doc inline, but you know, fingers started typing and a few pages later it was a very long JIRA comment :)

I'll do another rev to address your questions as well as flesh out the BN and Upgrade bits, and upload it as an attachment here some time tomorrow.