[jira] [Commented] (HDFS-1378) Edit log replay should track and report file offsets in case of errors
[ https://issues.apache.org/jira/browse/HDFS-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245945#comment-13245945 ]

Colin Patrick McCabe commented on HDFS-1378:

ran TestCheckpoint, TestEditLog, TestEditLogLoading, TestNameNodeMXBean, TestSaveNamespace, TestSecurityTokenEditLog, TestStorageDirectoryFailure, TestStorageRestore

Edit log replay should track and report file offsets in case of errors
--
    Key: HDFS-1378
    URL: https://issues.apache.org/jira/browse/HDFS-1378
    Project: Hadoop HDFS
    Issue Type: Improvement
    Components: name-node
    Affects Versions: 0.22.0
    Reporter: Todd Lipcon
    Assignee: Colin Patrick McCabe
    Fix For: 0.23.0
    Attachments: HDFS-1378-b1.002.patch, HDFS-1378-b1.003.patch, HDFS-1378-b1.004.patch, hdfs-1378-branch20.txt, hdfs-1378.0.patch, hdfs-1378.1.patch, hdfs-1378.2.txt

Occasionally there are bugs or operational mistakes that result in corrupt edit logs which I end up having to repair by hand. In these cases it would be very handy to have the error message also print out the file offsets of the last several edit log opcodes so it's easier to find the right place to edit in the OP_INVALID marker. We could also use this facility to provide a rough estimate of how far along edit log replay the NN is during startup (handy when a 2NN has died and replay takes a while).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
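The request in the description above (remember the file offsets of the last several opcodes, and derive a rough progress estimate from the current offset) could be sketched roughly as follows. This is only an illustration and is not taken from any of the attached patches: the class `RecentOpOffsets` and its methods are hypothetical names, and the progress estimate assumes the total length of the edit log file is known up front.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical helper: remembers the start offsets of the last N opcodes read. */
final class RecentOpOffsets {
    private final int capacity;
    private final Deque<Long> offsets = new ArrayDeque<>();

    RecentOpOffsets(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Record the file offset at which the next opcode starts. */
    void record(long offset) {
        if (offsets.size() == capacity) {
            offsets.removeFirst(); // drop the oldest offset once the window is full
        }
        offsets.addLast(offset);
    }

    /** Offsets of the most recent opcodes, oldest first, for use in an error message. */
    List<Long> snapshot() {
        return new ArrayList<>(offsets);
    }

    /** Rough replay progress as a percentage of the file length (an assumption: length is known). */
    static int percentComplete(long currentOffset, long fileLength) {
        if (fileLength <= 0) {
            return 100;
        }
        return (int) Math.min(100L, currentOffset * 100L / fileLength);
    }
}
```

A reader loop would call `record(offset)` before decoding each opcode and include `snapshot()` in the error message when replay fails, which gives the operator a short list of candidate positions for the OP_INVALID marker.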
[jira] [Commented] (HDFS-1378) Edit log replay should track and report file offsets in case of errors
[ https://issues.apache.org/jira/browse/HDFS-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13037368#comment-13037368 ]

Hudson commented on HDFS-1378:

Integrated in Hadoop-Hdfs-trunk #673 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/673/])

    Assignee: Aaron T. Myers
    Fix For: 0.23.0
    Attachments: hdfs-1378-branch20.txt, hdfs-1378.0.patch, hdfs-1378.1.patch, hdfs-1378.2.txt
[jira] [Commented] (HDFS-1378) Edit log replay should track and report file offsets in case of errors
[ https://issues.apache.org/jira/browse/HDFS-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13034573#comment-13034573 ]

Hudson commented on HDFS-1378:

Integrated in Hadoop-Hdfs-trunk-Commit #658 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/658/])

    Assignee: Aaron T. Myers
    Fix For: 0.23.0
    Attachments: hdfs-1378-branch20.txt, hdfs-1378.0.patch, hdfs-1378.1.patch, hdfs-1378.2.txt
[jira] [Commented] (HDFS-1378) Edit log replay should track and report file offsets in case of errors
[ https://issues.apache.org/jira/browse/HDFS-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031596#comment-13031596 ]

Hadoop QA commented on HDFS-1378:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12478776/hdfs-1378.1.patch
against trunk revision 1101753.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 5 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these core unit tests:
        org.apache.hadoop.cli.TestHDFSCLI
        org.apache.hadoop.hdfs.server.namenode.TestEditLogFileOutputStream
        org.apache.hadoop.hdfs.TestDFSShell
        org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
        org.apache.hadoop.hdfs.TestFileConcurrentReader
        org.apache.hadoop.tools.TestJMXGet
    +1 contrib tests. The patch passed contrib unit tests.
    +1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/479//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/479//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/479//console

This message is automatically generated.

    Assignee: Aaron T. Myers
    Fix For: 0.23.0
    Attachments: hdfs-1378-branch20.txt, hdfs-1378.0.patch, hdfs-1378.1.patch
[jira] [Commented] (HDFS-1378) Edit log replay should track and report file offsets in case of errors
[ https://issues.apache.org/jira/browse/HDFS-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031837#comment-13031837 ]

Aaron T. Myers commented on HDFS-1378:

All of the test failures except for {{TestEditLogFileOutputStream}} are known to be failing on trunk. The {{TestEditLogFileOutputStream}} failure appears to be transient. It passes on my box, and this is the message it failed with in the Jenkins run:

{{noformat}}
java.net.BindException: Port in use: 0.0.0.0:50070
{{noformat}}

    Assignee: Aaron T. Myers
    Fix For: 0.23.0
    Attachments: hdfs-1378-branch20.txt, hdfs-1378.0.patch, hdfs-1378.1.patch
[jira] [Commented] (HDFS-1378) Edit log replay should track and report file offsets in case of errors
[ https://issues.apache.org/jira/browse/HDFS-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032161#comment-13032161 ]

Hadoop QA commented on HDFS-1378:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12478890/hdfs-1378.2.txt
against trunk revision 1102094.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 6 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these core unit tests:
        org.apache.hadoop.cli.TestHDFSCLI
        org.apache.hadoop.hdfs.TestDFSShell
        org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
        org.apache.hadoop.hdfs.TestFileConcurrentReader
        org.apache.hadoop.tools.TestJMXGet
    +1 contrib tests. The patch passed contrib unit tests.
    +1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/487//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/487//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/487//console

This message is automatically generated.

    Assignee: Aaron T. Myers
    Fix For: 0.23.0
    Attachments: hdfs-1378-branch20.txt, hdfs-1378.0.patch, hdfs-1378.1.patch, hdfs-1378.2.txt
[jira] [Commented] (HDFS-1378) Edit log replay should track and report file offsets in case of errors
[ https://issues.apache.org/jira/browse/HDFS-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032192#comment-13032192 ]

Aaron T. Myers commented on HDFS-1378:

Updated patch looks good to me. Thanks for catching that, Todd.

    Assignee: Aaron T. Myers
    Fix For: 0.23.0
    Attachments: hdfs-1378-branch20.txt, hdfs-1378.0.patch, hdfs-1378.1.patch, hdfs-1378.2.txt
[jira] [Commented] (HDFS-1378) Edit log replay should track and report file offsets in case of errors
[ https://issues.apache.org/jira/browse/HDFS-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031053#comment-13031053 ]

Aaron T. Myers commented on HDFS-1378:

I should've mentioned: this patch is for trunk.

    Assignee: Aaron T. Myers
    Fix For: 0.23.0
    Attachments: hdfs-1378-branch20.txt, hdfs-1378.0.patch
[jira] [Commented] (HDFS-1378) Edit log replay should track and report file offsets in case of errors
[ https://issues.apache.org/jira/browse/HDFS-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031397#comment-13031397 ]

Todd Lipcon commented on HDFS-1378:

Oops. I missed the test that you so thoroughly included (was looking only at changed files and missed the new one). +1 pending Hudson results.

    Assignee: Aaron T. Myers
    Fix For: 0.23.0
    Attachments: hdfs-1378-branch20.txt, hdfs-1378.0.patch
[jira] [Commented] (HDFS-1378) Edit log replay should track and report file offsets in case of errors
[ https://issues.apache.org/jira/browse/HDFS-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031444#comment-13031444 ]

Hadoop QA commented on HDFS-1378:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12478673/hdfs-1378.0.patch
against trunk revision 1101343.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 2 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these core unit tests:
        org.apache.hadoop.cli.TestHDFSCLI
        org.apache.hadoop.hdfs.server.namenode.TestEditLog
        org.apache.hadoop.hdfs.TestDFSShell
        org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
        org.apache.hadoop.hdfs.TestFileConcurrentReader
    +1 contrib tests. The patch passed contrib unit tests.
    +1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/474//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/474//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/474//console

This message is automatically generated.

    Assignee: Aaron T. Myers
    Fix For: 0.23.0
    Attachments: hdfs-1378-branch20.txt, hdfs-1378.0.patch
[jira] [Commented] (HDFS-1378) Edit log replay should track and report file offsets in case of errors
[ https://issues.apache.org/jira/browse/HDFS-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016170#comment-13016170 ]

Aaron T. Myers commented on HDFS-1378:

Patch looks pretty solid, Todd, and very helpful. One comment:

There are large classes of edit log corruptions which will result in some exception other than an IOE being thrown. But this debugging info is only printed in the event an IOE is thrown. I've twice now had to change this code to catch NPE and recompile to get it to print this info. Ideally I think we'd change things so that this stuff is in a {{catch (Throwable t)}} block, with the actual exception being re-thrown after printing.

    Assignee: Todd Lipcon
    Attachments: hdfs-1378-branch20.txt
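The suggestion in the comment above (print the debugging info from a `catch (Throwable t)` block and then re-throw the original exception) might look roughly like the sketch below. It is not the code from the patch: `ReplaySketch`, `EditApplier`, and the in-memory `LOG` list are hypothetical stand-ins for the real loader and logger, and the bare `throw t` relies on the precise-rethrow rule available in Java 7 and later.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class ReplaySketch {
    /** Hypothetical stand-in for applying one decoded edit to the namespace. */
    interface EditApplier {
        void apply(long offset) throws IOException;
    }

    /** Messages that a real implementation would send to its logger. */
    static final List<String> LOG = new ArrayList<>();

    static void replay(long[] opOffsets, EditApplier applier) throws IOException {
        long lastOffset = -1;
        try {
            for (long offset : opOffsets) {
                lastOffset = offset; // remember where the current opcode starts
                applier.apply(offset);
            }
        } catch (Throwable t) {
            // Report the offset for any failure (IOException, NullPointerException, ...),
            // then re-throw the original exception unchanged.
            LOG.add("Error replaying edit log at offset " + lastOffset + ": " + t);
            throw t;
        }
    }
}
```

With this shape, a corrupt record that triggers a NullPointerException still produces the offset message, and the caller sees the same exception it would have seen without the catch block.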