[jira] [Commented] (HDFS-1149) Lease reassignment is not persisted to edit log
[ https://issues.apache.org/jira/browse/HDFS-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13065756#comment-13065756 ]

Tsz Wo (Nicholas), SZE commented on HDFS-1149:

DFSClientAdapter should be put under test; filed HDFS-2153.

Lease reassignment is not persisted to edit log
-----------------------------------------------
Key: HDFS-1149
URL: https://issues.apache.org/jira/browse/HDFS-1149
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.21.0, 0.22.0, 0.23.0
Reporter: Todd Lipcon
Assignee: Aaron T. Myers
Fix For: 0.23.0
Attachments: editsStored, hdfs-1149.0.patch, hdfs-1149.1.patch, hdfs-1149.1.patch, hdfs-1149.2.patch

During lease recovery, the lease gets reassigned to a special NN holder. This is not currently persisted to the edit log, which means that after an NN restart, the original leaseholder could end up allocating more blocks or completing a file that has already started recovery.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
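
The failure mode in the issue description can be illustrated with a toy model. This is only a sketch and not HDFS code: the class `ToyNameNode`, its methods, and the op names `OPEN` and `REASSIGN_LEASE` are hypothetical, and the holder name `NN_Recovery` is borrowed from the review comments in this thread. The model rebuilds its lease table by replaying an edit log, so a reassignment that was never logged is lost on restart.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model (not HDFS code): a lease table rebuilt by replaying an edit log. */
class ToyNameNode {
    // Hypothetical holder name, taken from the wording used in this thread.
    static final String RECOVERY_HOLDER = "NN_Recovery";

    private final List<String[]> editLog = new ArrayList<>();   // {op, path, holder}
    private final Map<String, String> leases = new HashMap<>(); // path -> holder

    /** A client creates a file and becomes its leaseholder; this is always logged. */
    void create(String path, String client) {
        leases.put(path, client);
        editLog.add(new String[] {"OPEN", path, client});
    }

    /** Lease recovery starts; the reassignment is logged only if persist is true. */
    void reassignLease(String path, boolean persist) {
        leases.put(path, RECOVERY_HOLDER);
        if (persist) {
            editLog.add(new String[] {"REASSIGN_LEASE", path, RECOVERY_HOLDER});
        }
    }

    /** Simulates a restart: in-memory state is dropped and the log is replayed. */
    ToyNameNode restart() {
        ToyNameNode fresh = new ToyNameNode();
        for (String[] op : editLog) {
            fresh.editLog.add(op);
            fresh.leases.put(op[1], op[2]); // both ops set path -> holder
        }
        return fresh;
    }

    String holderOf(String path) {
        return leases.get(path);
    }
}
```

With `persist` set to false, the restarted instance still reports the original client as the holder, which is the state in which that client could keep allocating blocks. With `persist` set to true, the recovery holder survives the restart.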
[jira] [Commented] (HDFS-1149) Lease reassignment is not persisted to edit log
[ https://issues.apache.org/jira/browse/HDFS-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051043#comment-13051043 ]

Hudson commented on HDFS-1149:

Integrated in Hadoop-Hdfs-trunk #699 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/699/])
[jira] [Commented] (HDFS-1149) Lease reassignment is not persisted to edit log
[ https://issues.apache.org/jira/browse/HDFS-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050125#comment-13050125 ]

Hudson commented on HDFS-1149:

Integrated in Hadoop-Hdfs-trunk-Commit #746 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/746/])
[jira] [Commented] (HDFS-1149) Lease reassignment is not persisted to edit log
[ https://issues.apache.org/jira/browse/HDFS-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13044999#comment-13044999 ]

Hadoop QA commented on HDFS-1149:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12481596/editsStored
against trunk revision 1132715.

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/721//console

This message is automatically generated.
[jira] [Commented] (HDFS-1149) Lease reassignment is not persisted to edit log
[ https://issues.apache.org/jira/browse/HDFS-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13045048#comment-13045048 ]

Hadoop QA commented on HDFS-1149:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12481598/hdfs-1149.1.patch
against trunk revision 1132715.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 20 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

-1 core tests. The patch failed these core unit tests:
org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer

+1 contrib tests. The patch passed contrib unit tests.

+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/722//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/722//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/722//console

This message is automatically generated.
[jira] [Commented] (HDFS-1149) Lease reassignment is not persisted to edit log
[ https://issues.apache.org/jira/browse/HDFS-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13045052#comment-13045052 ]

Todd Lipcon commented on HDFS-1149:

It seems to me that the new condition for safe mode check in internalReleaseLease should actually be an assert, right? The two call paths into internalReleaseLease are:

- FSNamesystem.startFileInternal, which checks safeMode at the top
- LeaseManager.checkLeases(), which is called from LeaseManager.Monitor inside a check against safemode

So any call of internalReleaseLease while in safe mode would be a real bug, right?

Otherwise looks good.
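
A minimal sketch of this suggestion, assuming the invariant is that internalReleaseLease must never run while the NameNode is in safe mode because both call paths check safe mode first. `ToyNamesystem` and its members are hypothetical and only borrow method names from the comment; this is not the FSNamesystem code or the patch. Java's `assert` statement is a no-op unless the JVM runs with `-ea`, so the sketch throws `AssertionError` explicitly to keep the behavior visible in any run.

```java
/** Toy sketch (not FSNamesystem): a caller-side safe mode check plus a callee-side invariant. */
class ToyNamesystem {
    private boolean safeMode;
    private int editsLogged;

    void setSafeMode(boolean on) {
        safeMode = on;
    }

    int editsLogged() {
        return editsLogged;
    }

    /** Entry point: an expected, recoverable rejection while in safe mode. */
    void startFile(String path) {
        if (safeMode) {
            throw new IllegalStateException("Cannot create " + path + " in safe mode");
        }
        internalReleaseLease(path);
    }

    /** Internal helper: every caller is expected to have checked safe mode already. */
    void internalReleaseLease(String path) {
        // Invariant check: reaching this point in safe mode means a caller has a bug.
        if (safeMode) {
            throw new AssertionError("internalReleaseLease called in safe mode: " + path);
        }
        editsLogged++; // stands in for writing the lease reassignment to the edit log
    }
}
```

The design difference is in who is blamed: the entry point reports a condition the client can retry later, while the helper reports a programming error in the NameNode itself.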
[jira] [Commented] (HDFS-1149) Lease reassignment is not persisted to edit log
[ https://issues.apache.org/jira/browse/HDFS-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13045139#comment-13045139 ]

Hadoop QA commented on HDFS-1149:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12481613/hdfs-1149.2.patch
against trunk revision 1132715.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 20 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

-1 core tests. The patch failed these core unit tests:
org.apache.hadoop.hdfs.server.balancer.TestBalancer
org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer

+1 contrib tests. The patch passed contrib unit tests.

+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/724//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/724//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/724//console

This message is automatically generated.
[jira] [Commented] (HDFS-1149) Lease reassignment is not persisted to edit log
[ https://issues.apache.org/jira/browse/HDFS-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13045143#comment-13045143 ]

Aaron T. Myers commented on HDFS-1149:

TestBalancer just passed for me on my local box.
[jira] [Commented] (HDFS-1149) Lease reassignment is not persisted to edit log
[ https://issues.apache.org/jira/browse/HDFS-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13042572#comment-13042572 ]

Hadoop QA commented on HDFS-1149:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12481177/hdfs-1149.0.patch
against trunk revision 1130339.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 6 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

-1 core tests. The patch failed these core unit tests:
org.apache.hadoop.hdfs.TestDFSUpgradeFromImage
org.apache.hadoop.hdfs.TestHDFSTrash
org.apache.hadoop.hdfs.TestHFlush
org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer
org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer

+1 contrib tests. The patch passed contrib unit tests.

+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/680//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/680//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/680//console

This message is automatically generated.
[jira] [Commented] (HDFS-1149) Lease reassignment is not persisted to edit log
[ https://issues.apache.org/jira/browse/HDFS-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13042578#comment-13042578 ]

Todd Lipcon commented on HDFS-1149:

A few nits:

- For DataNode.setHeartbeatsEnabled, I think it would be better to make it package-private, and then bounce through the DataNodeAdapter class to get at it. I also think it would be clearer if we inverted its meaning and renamed it to {{heartbeatsDisabledForTests}} - that way when reading the code later it will be clear that this is always false in normal operation.
- Same goes for all of the new public members in LeaseManager/Lease -- I think you can just move the getLeaseByPath function into NameNodeAdapter, then it can all stay package-protected, right?
- In the test case, I think it's better to call {{stm.hflush()}} after the writer has lost its lease -- this is a DN-only operation, which means that it's verifying that the lease recovery has gone all the way through, not just a NN state change. The fact that you check isUnderConstruction should already do that as well, but just a double-check. Then you can close the stream as well and check for the same exception.
- I think the new NAMENODE_LEASE_MANAGER_SLEEP_TIME is probably better named NAMENODE_LEASE_RECHECK_INTERVAL (more consistent with other variables like {{heartbeatRecheckInterval}} and {{replicationRecheckInterval}})

Other concern:

- Does this interact correctly with lease maintenance on rename/delete? I think so... but it would be good to add the following tests:

Test A:
1) client creates file /dir_a/file and leaves it open
2) client renames /dir_a to /dir_b (this calls LeaseManager.changeLease)
3) client dies, so lease recovery happens
4) NN reassigns lease to NN_Recovery
5) NN restarts and loads edits: NN_Recovery should own the lease on the new location of the file
[ this tests that on edit log replay, the lease is properly tracked to the new name of the file ]

Test B:
1) client creates file /file and leaves it open
2) client deletes file /file
3) client dies, so lease recovery happens
4) NN reassigns lease to NN_Recovery
5) NN restarts and loads edits: no NPEs or anything

I'm also wondering if we have an issue with regards to safeMode. In theory we should never write anything to the edit log while in safemode, but I don't see safemode checks in internalReleaseLease. This is similar to the bugs seen in HDFS-988 if you want some background.
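
Test A and Test B can be modeled as edit log replays. The following is a toy sketch only, not HDFS code and not the tests that were added to the patch: `ToyEditLogReplay`, the record layout, and the op names are all hypothetical. It shows the two properties the scenarios are after: a reassignment logged after a rename must land on the file's new path, and a reassignment for a path that no longer exists must be tolerated rather than crash the replay.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy replay model (not HDFS code) for the rename and delete scenarios. */
class ToyEditLogReplay {
    /** Replays {op, arg1, arg2} records and returns the resulting path -> holder map. */
    static Map<String, String> replay(List<String[]> edits) {
        Map<String, String> leases = new HashMap<>();
        for (String[] e : edits) {
            switch (e[0]) {
                case "OPEN":            // arg1 = path, arg2 = holder
                    leases.put(e[1], e[2]);
                    break;
                case "RENAME": {        // arg1 = source prefix, arg2 = destination prefix
                    Map<String, String> moved = new HashMap<>();
                    for (String path : new ArrayList<>(leases.keySet())) {
                        if (path.equals(e[1]) || path.startsWith(e[1] + "/")) {
                            String newPath = e[2] + path.substring(e[1].length());
                            moved.put(newPath, leases.remove(path));
                        }
                    }
                    leases.putAll(moved);
                    break;
                }
                case "DELETE":          // arg1 = path
                    leases.remove(e[1]);
                    break;
                case "REASSIGN_LEASE":  // arg1 = path, arg2 = new holder
                    // Skip paths with no lease, e.g. a file deleted earlier in the log.
                    if (leases.containsKey(e[1])) {
                        leases.put(e[1], e[2]);
                    }
                    break;
                default:
                    throw new IllegalArgumentException("unknown op " + e[0]);
            }
        }
        return leases;
    }
}
```

For Test A the log would be OPEN /dir_a/file, RENAME /dir_a to /dir_b, then REASSIGN_LEASE on /dir_b/file; for Test B it would be OPEN /file, DELETE /file, then REASSIGN_LEASE on /file. Whether the real reassignment record is written against the old or the new path is exactly what the proposed tests would pin down.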
[jira] Commented: (HDFS-1149) Lease reassignment is not persisted to edit log
[ https://issues.apache.org/jira/browse/HDFS-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12870850#action_12870850 ]

sam rash commented on HDFS-1149:

Hey Todd,

Can you please specify the sequence of events that occur and the ending error state?