[jira] [Commented] (HDFS-12369) Edit log corruption due to hard lease recovery of not-closed file which has snapshots
[ https://issues.apache.org/jira/browse/HDFS-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16304855#comment-16304855 ] Xiao Chen commented on HDFS-12369: -- As it turned out, the exact symptom of this issue can vary, depending on the file's status at the time of recovery. At the core of the issue, the lease recovery of a deleted file should not log any edits, which is what this jira fixed.
{noformat}
2017-12-27 15:38:57,360 ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception on operation CloseOp [length=0, inodeId=0, path=/filename, replication=3, mtime=1514301485930, atime=1514297585263, blockSize=268435456, blocks=[blk_1863506432_791165194, blk_1863506631_791165393, blk_1863506826_791165588], permissions=hdfs:superuser:rw-r--r--, aclEntries=null, clientName=, clientMachine=, overwrite=false, storagePolicyId=0, opCode=OP_CLOSE, txid=10577364851]
java.io.IOException: Mismatched block IDs or generation stamps, attempting to replace block blk_1863518793_791177559 with blk_1863506432_791165194 as block # 0/3 of /filename
	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.updateBlocks(FSEditLogLoader.java:942)
	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:434)
	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232)
	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:141)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:897)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:750)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:318)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1125)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:789)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:614)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:676)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:844)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:823)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
{noformat}
This happened when a file with the same name was created (1st time) -> deleted -> created again by a 2nd user -> lease recovered (for the 1st creation) -> closed by the 2nd user, resulting in 2 CLOSE ops in the edits.
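The create/delete/recreate/recover sequence above can be sketched with a toy replay model. This is not HDFS code; the paths, block IDs, and method names are illustrative only. It models why a stale OP_CLOSE from hard lease recovery of the 1st creation cannot be applied after the file has been recreated: the op carries the old file's block IDs, which no longer match the inode.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model (not HDFS code) of edit-log replay for the scenario above:
// a CLOSE op logged by hard lease recovery of a deleted-and-recreated file
// carries the first file's block ID, which mismatches the current inode.
public class EditReplaySim {
    // path -> block id of the file currently at that path
    static Map<String, Long> namespace = new HashMap<>();

    static void create(String path, long blockId) { namespace.put(path, blockId); }
    static void delete(String path) { namespace.remove(path); }

    // Replaying OP_CLOSE: the op's block id must match the inode's.
    static String close(String path, long blockId) {
        Long current = namespace.get(path);
        if (current == null) return "FileNotFoundException: File does not exist: " + path;
        if (current != blockId) return "Mismatched block IDs or generation stamps"; // NN startup aborts here
        return "ok";
    }

    public static void main(String[] args) {
        create("/filename", 1863506432L);  // 1st creation
        delete("/filename");               // deleted
        create("/filename", 1863518793L);  // re-created by a 2nd user
        // Stale CLOSE logged by hard lease recovery of the 1st creation:
        System.out.println(close("/filename", 1863506432L)); // Mismatched block IDs or generation stamps
        System.out.println(close("/filename", 1863518793L)); // ok (the 2nd user's close)
    }
}
```

With the fix, the recovery of the deleted file logs nothing, so the stale CLOSE never reaches the edit log in the first place.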
[jira] [Commented] (HDFS-12369) Edit log corruption due to hard lease recovery of not-closed file which has snapshots
[ https://issues.apache.org/jira/browse/HDFS-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16157876#comment-16157876 ] Hudson commented on HDFS-12369: ---
SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12813 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/12813/])
HDFS-12369. Edit log corruption due to hard lease recovery of not-closed (xiao: rev 52b894db33bc68b46eec5cdf2735dfcf4030853a)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDeleteRace.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/LeaseManager.java

> Edit log corruption due to hard lease recovery of not-closed file which has
> snapshots
> -
>
> Key: HDFS-12369
> URL: https://issues.apache.org/jira/browse/HDFS-12369
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Reporter: Xiao Chen
> Assignee: Xiao Chen
> Fix For: 2.9.0, 3.0.0-beta1, 2.8.3
>
> Attachments: HDFS-12369.01.patch, HDFS-12369.02.patch, HDFS-12369.03.patch, HDFS-12369.test.patch
>
>
> HDFS-6257 and HDFS-7707 worked hard to prevent corruption from combinations of client operations.
> Recently, we have observed NN not able to start with the following exception:
> {noformat}
> 2017-08-17 14:32:18,418 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
> java.io.FileNotFoundException: File does not exist: /home/Events/CancellationSurvey_MySQL/2015/12/31/.part-0.9nlJ3M
> 	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
> 	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:429)
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232)
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:141)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:897)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:750)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:318)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1125)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:789)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:614)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:676)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:844)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:823)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
> {noformat}
> Quoting a nicely analysed edits:
> {quote}
> In the edits logged about 1 hour later, we see this failing OP_CLOSE. The sequence in the edits shows the file going through:
> OPEN
> ADD_BLOCK
> CLOSE
> ADD_BLOCK # perhaps this was an append
> DELETE
> (about 1 hour later) CLOSE
> It is interesting that there was no CLOSE logged before the delete.
> {quote}
> Grepping that file name, it turns out the close was triggered by {{LeaseManager}}, when the lease reaches hard limit.
> {noformat}
> 2017-08-16 15:05:45,927 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering [Lease. Holder: DFSClient_NONMAPREDUCE_-1997177597_28, pending creates: 75], src=/home/Events/CancellationSurvey_MySQL/2015/12/31/.part-0.9nlJ3M
> 2017-08-16 15:05:45,927 WARN org.apache.hadoop.hdfs.StateChange: BLOCK* internalReleaseLease: All existing blocks are COMPLETE, lease removed, file /home/Events/CancellationSurvey_MySQL/2015/12/31/.part-0.9nlJ3M closed.
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12369) Edit log corruption due to hard lease recovery of not-closed file which has snapshots
[ https://issues.apache.org/jira/browse/HDFS-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16157349#comment-16157349 ] Yongjun Zhang commented on HDFS-12369: -- +1 on rev3. Thanks Xiao.
[jira] [Commented] (HDFS-12369) Edit log corruption due to hard lease recovery of not-closed file which has snapshots
[ https://issues.apache.org/jira/browse/HDFS-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155041#comment-16155041 ] Hadoop QA commented on HDFS-12369: --
-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 13m 43s | trunk passed |
| +1 | compile | 0m 48s | trunk passed |
| +1 | checkstyle | 0m 39s | trunk passed |
| +1 | mvnsite | 0m 53s | trunk passed |
| +1 | findbugs | 1m 39s | trunk passed |
| +1 | javadoc | 0m 41s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 49s | the patch passed |
| +1 | compile | 0m 47s | the patch passed |
| +1 | javac | 0m 47s | the patch passed |
| +1 | checkstyle | 0m 36s | the patch passed |
| +1 | mvnsite | 0m 50s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 45s | the patch passed |
| +1 | javadoc | 0m 38s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 91m 38s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 16s | The patch does not generate ASF License warnings. |
| | | 117m 15s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestBlockStoragePolicy |
| | hadoop.hdfs.TestLeaseRecoveryStriped |
| | hadoop.hdfs.TestDatanodeConfig |
| | hadoop.hdfs.server.namenode.TestReencryptionWithKMS |
| Timed out junit tests | org.apache.hadoop.hdfs.TestWriteReadStripedFile |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12369 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12885539/HDFS-12369.03.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux fba8e4ae3c99 3.13.0-117-generic #164-Ubuntu SMP Fri Apr 7 11:05:26 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1f3bc63 |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/21016/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/21016/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/21016/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HDFS-12369) Edit log corruption due to hard lease recovery of not-closed file which has snapshots
[ https://issues.apache.org/jira/browse/HDFS-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16154913#comment-16154913 ] Xiao Chen commented on HDFS-12369: --
Thanks for the review Yongjun.
bq. does this issue only occur when the file has a snapshot?
Good question! Yes, because only with snapshots does the delete go down the path of {{FSDirDeleteOp#unprotectedDelete}} -> {{INodeFile#cleanSubtree}}, which eventually ends up not calling {{clearFile}} in:
{code:title=FileWithSnapshotFeature.java}
  public void collectBlocksAndClear(
      INode.ReclaimContext reclaimContext, final INodeFile file) {
    // check if everything is deleted.
    if (isCurrentFileDeleted() && getDiffs().asList().isEmpty()) {
      file.clearFile(reclaimContext);
      return;
    }
{code}
Added a few comments in the test about this, and also updated the jira title.
Good catch on the extra ';'. Attached patch 3 to address that, the previous checkstyle warning, and to add the {{addBlock}} case to the test.
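The guard quoted above can be illustrated with a toy model. This is not HDFS code; the data structures and method names are illustrative only, and the pre-fix recovery behavior is modeled as described in this thread. It shows why the bug needs a snapshot: only a snapshot diff keeps the deleted file's inode alive for the hard lease recovery to find and close.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model (not HDFS code) of the collectBlocksAndClear guard: on delete,
// the inode is fully cleared only when no snapshot diff still references it.
public class SnapshotDeleteSim {
    static Set<String> namespace = new HashSet<>();       // live paths
    static Map<String, Boolean> inodes = new HashMap<>(); // path -> has snapshot diff
    static Set<String> leases = new HashSet<>();          // open (not-closed) files

    static void create(String path, boolean snapshotted) {
        namespace.add(path);
        inodes.put(path, snapshotted);
        leases.add(path);  // the file is open for write, so it holds a lease
    }

    static void delete(String path) {
        namespace.remove(path);
        // Mirrors: if (isCurrentFileDeleted() && getDiffs().asList().isEmpty()) clearFile(...)
        if (!inodes.get(path)) inodes.remove(path);
    }

    // Pre-fix hard lease recovery (as described above): it logged OP_CLOSE
    // whenever the inode still existed, even for a deleted file.
    static boolean recoveryLogsClose(String path) {
        return leases.contains(path) && inodes.containsKey(path);
    }

    public static void main(String[] args) {
        create("/a", false); delete("/a");
        System.out.println(recoveryLogsClose("/a")); // false: inode cleared, nothing to close
        create("/b", true);  delete("/b");
        System.out.println(recoveryLogsClose("/b")); // true: snapshot kept the inode, so a
        // CLOSE is logged for a path absent from the namespace and replay fails
    }
}
```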
[jira] [Commented] (HDFS-12369) Edit log corruption due to hard lease recovery of not-closed file
[ https://issues.apache.org/jira/browse/HDFS-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16154668#comment-16154668 ] Yongjun Zhang commented on HDFS-12369: --
Hi [~xiaochen],
Thanks for working on this issue. The change looks good to me.
One question: does this issue only occur when the file has a snapshot? The test indicates that. If it also occurs when there is no snapshot, it would be nice to have a test for that.
BTW, noticed an extra ";" in
{code}
final INodeFile lastINode = iip.getLastINode().asFile();;
{code}
[jira] [Commented] (HDFS-12369) Edit log corruption due to hard lease recovery of not-closed file
[ https://issues.apache.org/jira/browse/HDFS-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16146552#comment-16146552 ] Xiao Chen commented on HDFS-12369: --
Following the above, we also brainstormed about other scenarios where a non-closed stream could cause issues. We only came up with {{addBlock}}, since the stream holder could keep writing to that stream. After further inspecting the code, though, {{addBlock}} is also safe: it too requires a path resolution, which fails once the file is deleted.
I can incorporate the above into the unit test in the next rev, along with any review comments to come.
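The addBlock reasoning above can be sketched with a toy model. This is not HDFS code; the namespace and method names are illustrative only. It shows why a write on a deleted file's stream is harmless: block allocation resolves the path first, so the failing resolution stops any ADD_BLOCK edit from being logged.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model (not HDFS code) of why addBlock on a deleted file is safe:
// path resolution happens before the edit is logged, and it fails once
// the file is gone, so no ADD_BLOCK op reaches the edit log.
public class AddBlockSim {
    static Set<String> namespace = new HashSet<>();
    static int editsLogged = 0;

    static String addBlock(String path) {
        if (!namespace.contains(path)) {
            // Resolution fails, mirroring INodeFile.valueOf throwing FNFE.
            return "FileNotFoundException: File does not exist: " + path;
        }
        editsLogged++;  // only reached for a live file
        return "block allocated";
    }

    public static void main(String[] args) {
        namespace.add("/f");
        System.out.println(addBlock("/f"));  // block allocated
        namespace.remove("/f");              // file deleted
        System.out.println(addBlock("/f"));  // FileNotFoundException..., no edit logged
        System.out.println(editsLogged);     // 1
    }
}
```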
[jira] [Commented] (HDFS-12369) Edit log corruption due to hard lease recovery of not-closed file
[ https://issues.apache.org/jira/browse/HDFS-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16146455#comment-16146455 ] Hadoop QA commented on HDFS-12369: --

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 5m 38s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 16m 41s | trunk passed |
| +1 | compile | 1m 1s | trunk passed |
| +1 | checkstyle | 0m 45s | trunk passed |
| +1 | mvnsite | 1m 4s | trunk passed |
| +1 | findbugs | 2m 2s | trunk passed |
| +1 | javadoc | 0m 43s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 54s | the patch passed |
| +1 | compile | 0m 49s | the patch passed |
| +1 | javac | 0m 49s | the patch passed |
| -0 | checkstyle | 0m 39s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 208 unchanged - 0 fixed = 210 total (was 208) |
| +1 | mvnsite | 0m 58s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 59s | the patch passed |
| +1 | javadoc | 0m 47s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 102m 32s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 17s | The patch does not generate ASF License warnings. |
| | | 138m 20s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
| | hadoop.hdfs.TestLeaseRecoveryStriped |
| | hadoop.hdfs.TestDFSStripedInputStreamWithRandomECPolicy |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure000 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
| | hadoop.hdfs.TestReadStripedFileWithDecoding |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure120 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure020 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure130 |
| | hadoop.hdfs.TestMaintenanceState |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030 |
| | hadoop.hdfs.TestListFilesInFileContext |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure210 |
| | hadoop.hdfs.TestReconstructStripedFile |
| Timed out junit tests | org.apache.hadoop.hdfs.TestWriteReadStripedFile |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-12369 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12884362/HDFS-12369.02.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 2deedb73a66e 3.13.0-117-generic #164-Ubuntu SMP Fri Apr 7 11:05:26 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
[jira] [Commented] (HDFS-12369) Edit log corruption due to hard lease recovery of not-closed file
[ https://issues.apache.org/jira/browse/HDFS-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16144787#comment-16144787 ] Hadoop QA commented on HDFS-12369: --

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 25s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 15m 41s | trunk passed |
| +1 | compile | 0m 51s | trunk passed |
| +1 | checkstyle | 0m 39s | trunk passed |
| +1 | mvnsite | 0m 57s | trunk passed |
| +1 | findbugs | 1m 56s | trunk passed |
| +1 | javadoc | 0m 44s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 58s | the patch passed |
| +1 | compile | 0m 55s | the patch passed |
| +1 | javac | 0m 55s | the patch passed |
| -0 | checkstyle | 0m 40s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 184 unchanged - 0 fixed = 185 total (was 184) |
| +1 | mvnsite | 0m 57s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 58s | the patch passed |
| +1 | javadoc | 0m 43s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 119m 13s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
| | | 148m 23s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure060 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure100 |
| | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure130 |
| | hadoop.hdfs.tools.TestDFSAdminWithHA |
| | hadoop.hdfs.TestDFSStripedInputStreamWithRandomECPolicy |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure000 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure180 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure050 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure190 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure210 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure170 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure020 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090 |
| | hadoop.hdfs.TestLeaseRecoveryStriped |
| | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 |
| | hadoop.hdfs.server.datanode.TestDirectoryScanner |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure200 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure040 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure120 |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
[jira] [Commented] (HDFS-12369) Edit log corruption due to hard lease recovery of not-closed file
[ https://issues.apache.org/jira/browse/HDFS-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16144529#comment-16144529 ] Xiao Chen commented on HDFS-12369: --

Attaching a unit test that sort of reproduces this - it ends up with an NPE when loading {{ReassignLeaseOp}}, instead of the FNFE when loading the {{CloseOp}}. I'm still looking into this, but wanted to post here for early discussion.

> Edit log corruption due to hard lease recovery of not-closed file
> -
>
> Key: HDFS-12369
> URL:
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Reporter: Xiao Chen
> Assignee: Xiao Chen
> Attachments: HDFS-12369.test.patch
>
> HDFS-6257 and HDFS-7707 worked hard to prevent corruption from combinations of client operations.
> Recently, we have observed the NN unable to start with the following exception:
> {noformat}
> 2017-08-17 14:32:18,418 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
> java.io.FileNotFoundException: File does not exist: /home/Events/CancellationSurvey_MySQL/2015/12/31/.part-0.9nlJ3M
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:429)
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232)
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:141)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:897)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:750)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:318)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1125)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:789)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:614)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:676)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:844)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:823)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
> {noformat}
> Quoting a nice analysis of the edits:
> {quote}
> In the edits logged about 1 hour later, we see this failing OP_CLOSE. The sequence in the edits shows the file going through:
> OPEN
> ADD_BLOCK
> CLOSE
> ADD_BLOCK  # perhaps this was an append
> DELETE
> (about 1 hour later) CLOSE
> It is interesting that there was no CLOSE logged before the delete.
> {quote}
> Grepping for that file name, it turns out the close was triggered by the {{LeaseManager}} when the lease reached its hard limit.
> {noformat}
> 2017-08-16 15:05:45,927 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering [Lease. Holder: DFSClient_NONMAPREDUCE_-1997177597_28, pending creates: 75], src=/home/Events/CancellationSurvey_MySQL/2015/12/31/.part-0.9nlJ3M
> 2017-08-16 15:05:45,927 WARN org.apache.hadoop.hdfs.StateChange: BLOCK* internalReleaseLease: All existing blocks are COMPLETE, lease removed, file /home/Events/CancellationSurvey_MySQL/2015/12/31/.part-0.9nlJ3M closed.
> {noformat}

-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
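The failure mode in the op sequence quoted above can be illustrated with a self-contained toy replay loop. This is only a sketch, not HDFS code: the class, enum, and method names below are hypothetical stand-ins for the real FSEditLogLoader machinery. The point it demonstrates is that once a DELETE has removed the inode, a CLOSE logged later (here, by hard lease recovery of the already-deleted file) has nothing to resolve at replay time, which surfaces as the FileNotFoundException in the quoted NameNode startup stack trace.

```java
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Map;

/** Toy edit-log replay sketch for the HDFS-12369 failure mode (all names hypothetical). */
public class EditLogReplaySketch {
    public enum Op { OPEN, ADD_BLOCK, CLOSE, DELETE }

    // Minimal stand-in for the namespace: path -> "is the file under construction".
    private final Map<String, Boolean> namespace = new HashMap<>();

    public void apply(Op op, String path) throws FileNotFoundException {
        switch (op) {
            case OPEN:
                namespace.put(path, true);
                break;
            case ADD_BLOCK:
            case CLOSE:
                // Like the real loader, replay must resolve the inode before applying the op.
                if (!namespace.containsKey(path)) {
                    throw new FileNotFoundException("File does not exist: " + path);
                }
                if (op == Op.CLOSE) {
                    namespace.put(path, false);
                }
                break;
            case DELETE:
                namespace.remove(path);
                break;
        }
    }

    public static void main(String[] args) {
        EditLogReplaySketch replay = new EditLogReplaySketch();
        String path = "/tmp/not-closed-file";
        // The sequence from the analysed edits: OPEN, ADD_BLOCK, CLOSE, ADD_BLOCK (append), DELETE...
        Op[] loggedOps = { Op.OPEN, Op.ADD_BLOCK, Op.CLOSE, Op.ADD_BLOCK, Op.DELETE };
        try {
            for (Op op : loggedOps) {
                replay.apply(op, path);
            }
            // ...then the CLOSE logged about an hour later by hard lease recovery:
            replay.apply(Op.CLOSE, path);
            System.out.println("replay succeeded");
        } catch (FileNotFoundException e) {
            // prints: replay failed: File does not exist: /tmp/not-closed-file
            System.out.println("replay failed: " + e.getMessage());
        }
    }
}
```

In real HDFS the resolution happens in {{FSEditLogLoader.applyEditLogOp}} via {{INodeFile.valueOf}}, as in the quoted stack trace, and per the fix in this jira, hard lease recovery of a deleted file should not log anything to the edit log in the first place.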