[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed
[ https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14303678#comment-14303678 ] Brahma Reddy Battula commented on HDFS-7414: This defect is duplicate to HDFS-7707 Namenode got shutdown and can't recover where edit update might be missed - Key: HDFS-7414 URL: https://issues.apache.org/jira/browse/HDFS-7414 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.1, 2.6.0, 2.5.1 Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Priority: Blocker Scenario: Was running mapreduce job. CPU usage crossed 190% for Datanode and machine became slow.. and seen the following exception .. *Did not get the exact root cause, but as cpu usage more edit log updation might be missed...Need dig to more...anyone have any thoughts.* {noformat} 2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation CloseOp [length=0, inodeId=0, path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025, replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, clientMachine=, opCode=OP_CLOSE, txid=162982] | org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232) java.io.FileNotFoundException: File does not exist: /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1387) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1459) 2014-11-20 05:01:18,654 | WARN | main | Encountered exception loading fsimage | org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:642) java.io.FileNotFoundException: File does not exist: /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724)
[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed
[ https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14246436#comment-14246436 ] Rakesh R commented on HDFS-7414: Yeah [~brahmareddy], edit log viewer shows the same. I'm suspecting the could be chances of the following: Namenode got shutdown and can't recover where edit update might be missed - Key: HDFS-7414 URL: https://issues.apache.org/jira/browse/HDFS-7414 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.1, 2.5.1 Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Priority: Blocker Scenario: Was running mapreduce job. CPU usage crossed 190% for Datanode and machine became slow.. and seen the following exception .. *Did not get the exact root cause, but as cpu usage more edit log updation might be missed...Need dig to more...anyone have any thoughts.* {noformat} 2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation CloseOp [length=0, inodeId=0, path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025, replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, clientMachine=, opCode=OP_CLOSE, txid=162982] | org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232) java.io.FileNotFoundException: File does not exist: /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1387) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1459) 2014-11-20 05:01:18,654 | WARN | main | Encountered exception loading fsimage | org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:642) java.io.FileNotFoundException: File does not exist: /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741) at
[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed
[ https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14246440#comment-14246440 ] Rakesh R commented on HDFS-7414: Yeah Brahma, edit log viewer shows the same. I'm suspecting the chances of occurring the below operations concurrently. Let me try to reproduce the same. operation-1) internal lease releases occurred and initialize block recovery. This will add the OP_CLOSE entry. operation-2) client deleted the file. This will add OP_DELETE entry Namenode got shutdown and can't recover where edit update might be missed - Key: HDFS-7414 URL: https://issues.apache.org/jira/browse/HDFS-7414 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.1, 2.5.1 Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Priority: Blocker Scenario: Was running mapreduce job. CPU usage crossed 190% for Datanode and machine became slow.. and seen the following exception .. *Did not get the exact root cause, but as cpu usage more edit log updation might be missed...Need dig to more...anyone have any thoughts.* {noformat} 2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation CloseOp [length=0, inodeId=0, path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025, replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, clientMachine=, opCode=OP_CLOSE, txid=162982] | org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232) java.io.FileNotFoundException: File does not exist: /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1387) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1459) 2014-11-20 05:01:18,654 | WARN | main | Encountered exception loading fsimage | org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:642) java.io.FileNotFoundException: File does not exist: /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519)
[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed
[ https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14246451#comment-14246451 ] Vinayakumar B commented on HDFS-7414: - Looks like you hit the HDFS-6825 which was also due to extra OP_CLOSE edits. Check whether you got the stacktrace mentioned in HDFS-6825 to confirm the same. Namenode got shutdown and can't recover where edit update might be missed - Key: HDFS-7414 URL: https://issues.apache.org/jira/browse/HDFS-7414 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.1, 2.5.1 Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Priority: Blocker Scenario: Was running mapreduce job. CPU usage crossed 190% for Datanode and machine became slow.. and seen the following exception .. *Did not get the exact root cause, but as cpu usage more edit log updation might be missed...Need dig to more...anyone have any thoughts.* {noformat} 2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation CloseOp [length=0, inodeId=0, path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025, replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, clientMachine=, opCode=OP_CLOSE, txid=162982] | org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232) java.io.FileNotFoundException: File does not exist: /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1387) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1459) 2014-11-20 05:01:18,654 | WARN | main | Encountered exception loading fsimage | org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:642) java.io.FileNotFoundException: File does not exist: /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575) at
[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed
[ https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14246471#comment-14246471 ] Rakesh R commented on HDFS-7414: Hi Vinay, By seeing the [HDFS-6825 comment |https://issues.apache.org/jira/browse/HDFS-6825?focusedCommentId=14098682page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14098682], it looks like similar case. Namenode got shutdown and can't recover where edit update might be missed - Key: HDFS-7414 URL: https://issues.apache.org/jira/browse/HDFS-7414 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.1, 2.5.1 Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Priority: Blocker Scenario: Was running mapreduce job. CPU usage crossed 190% for Datanode and machine became slow.. and seen the following exception .. *Did not get the exact root cause, but as cpu usage more edit log updation might be missed...Need dig to more...anyone have any thoughts.* {noformat} 2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation CloseOp [length=0, inodeId=0, path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025, replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, clientMachine=, opCode=OP_CLOSE, txid=162982] | org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232) java.io.FileNotFoundException: File does not exist: /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1387) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1459) 2014-11-20 05:01:18,654 | WARN | main | Encountered exception loading fsimage | org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:642) java.io.FileNotFoundException: File does not exist: /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575) at
[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed
[ https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14245908#comment-14245908 ] Brahma Reddy Battula commented on HDFS-7414: Got the cause: OP_CLOSE has been added after the OP_DELETE of the parent node, This causes the exception. So ignore the node file doesnt exists exception as the file is already deleted from the Namenode so it can suppress this exception. Namenode got shutdown and can't recover where edit update might be missed - Key: HDFS-7414 URL: https://issues.apache.org/jira/browse/HDFS-7414 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.1, 2.5.1 Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Priority: Blocker Scenario: Was running mapreduce job. CPU usage crossed 190% for Datanode and machine became slow.. and seen the following exception .. *Did not get the exact root cause, but as cpu usage more edit log updation might be missed...Need dig to more...anyone have any thoughts.* {noformat} 2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation CloseOp [length=0, inodeId=0, path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025, replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, clientMachine=, opCode=OP_CLOSE, txid=162982] | org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232) java.io.FileNotFoundException: File does not exist: /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1387) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1459) 2014-11-20 05:01:18,654 | WARN | main | Encountered exception loading fsimage | org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:642) java.io.FileNotFoundException: File does not exist: /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519) at
[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed
[ https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234088#comment-14234088 ] Brahma Reddy Battula commented on HDFS-7414: Hi [~yzhangal] Thanks for taking a look into this issue.. Yes, I have the HDFS-6527 fix..Please check the following for same... {code} private boolean isFileDeleted(INodeFile file) { return (this.dir.getInode(file.getId()) == null) || (file.getParent() == null) || ((file.isWithSnapshot()) (file.getFileWithSnapshotFeature().isCurrentFileDeleted())); } {code} Namenode got shutdown and can't recover where edit update might be missed - Key: HDFS-7414 URL: https://issues.apache.org/jira/browse/HDFS-7414 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.1, 2.5.1 Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Priority: Blocker Scenario: Was running mapreduce job. CPU usage crossed 190% for Datanode and machine became slow.. and seen the following exception .. *Did not get the exact root cause, but as cpu usage more edit log updation might be missed...Need dig to more...anyone have any thoughts.* {noformat} 2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation CloseOp [length=0, inodeId=0, path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025, replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, clientMachine=, opCode=OP_CLOSE, txid=162982] | org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232) java.io.FileNotFoundException: File does not exist: /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1387) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1459) 2014-11-20 05:01:18,654 | WARN | main | Encountered exception loading fsimage | org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:642) java.io.FileNotFoundException: File does not exist: /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640) at
[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed
[ https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234684#comment-14234684 ] Yongjun Zhang commented on HDFS-7414: - HI [~brahmareddy], There is another related jira to be aware of: HDFS-6825. If you could describe the steps that reproduce the issue, that will help. Thanks. Namenode got shutdown and can't recover where edit update might be missed - Key: HDFS-7414 URL: https://issues.apache.org/jira/browse/HDFS-7414 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.1, 2.5.1 Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Priority: Blocker Scenario: Was running mapreduce job. CPU usage crossed 190% for Datanode and machine became slow.. and seen the following exception .. *Did not get the exact root cause, but as cpu usage more edit log updation might be missed...Need dig to more...anyone have any thoughts.* {noformat} 2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation CloseOp [length=0, inodeId=0, path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025, replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, clientMachine=, opCode=OP_CLOSE, txid=162982] | org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232) java.io.FileNotFoundException: File does not exist: /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1387) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1459) 2014-11-20 05:01:18,654 | WARN | main | Encountered exception loading fsimage | org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:642) java.io.FileNotFoundException: File does not exist: /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575) at
[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed
[ https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14233885#comment-14233885 ] Yongjun Zhang commented on HDFS-7414: - HI [~brahmareddy], Thanks for reporting this issue. The symptom looks similar to HDFS-6527, which is fixed in 2.5.0. You reported here 2.5.1, I assume you have the fix of HDFS-6527, but would you please confirm? Thanks. Namenode got shutdown and can't recover where edit update might be missed - Key: HDFS-7414 URL: https://issues.apache.org/jira/browse/HDFS-7414 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.1, 2.5.1 Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Priority: Blocker Scenario: Was running mapreduce job. CPU usage crossed 190% for Datanode and machine became slow.. and seen the following exception .. *Did not get the exact root cause, but as cpu usage more edit log updation might be missed...Need dig to more...anyone have any thoughts.* {noformat} 2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation CloseOp [length=0, inodeId=0, path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025, replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, clientMachine=, opCode=OP_CLOSE, txid=162982] | org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232) java.io.FileNotFoundException: File does not exist: /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1387) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1459) 2014-11-20 05:01:18,654 | WARN | main | Encountered exception loading fsimage | org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:642) java.io.FileNotFoundException: File does not exist: /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575) at
[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed
[ https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14232582#comment-14232582 ] Brahma Reddy Battula commented on HDFS-7414: Following is my analysis,two things might cause edits missing,1)snapshot 2)delete dir while tailing edits inprogress.. *{color:blue}StandBy Namenode{color}* {noformat} 2014-11-27 23:57:05,835 | INFO | Edit log tailer | Triggering log roll on remote NameNode linux157/*.*.*.157:25000 | org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:269) 2014-11-27 23:57:06,129 | INFO | Edit log tailer | Reading org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@419cf8c9 expecting start txid #1034147 | org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:802) 2014-11-27 23:57:06,130 | INFO | Edit log tailer | Start loading edits file https://linux158:8481/getJournal?jid=haclustersegmentTxId=1034147storageInfo=-56%3A295084204%3A0%3Amyhacluster, https://linux156:8481/getJournal?jid=haclustersegmentTxId=1034147storageInfo=-56%3A295084204%3A0%3Amyhacluster | org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:132) 2014-11-27 23:57:06,130 | INFO | Edit log tailer | Fast-forwarding stream 'https://linux158:8481/getJournal?jid=haclustersegmentTxId=1034147storageInfo=-56%3A295084204%3A0%3Amyhacluster, https://linux156:8481/getJournal?jid=haclustersegmentTxId=1034147storageInfo=-56%3A295084204%3A0%3Amyhacluster' to transaction ID 1034147 | org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:176) 2014-11-27 23:57:06,130 | INFO | Edit log tailer | Fast-forwarding stream 'https://linux158:8481/getJournal?jid=haclustersegmentTxId=1034147storageInfo=-56%3A295084204%3A0%3Amyhacluster' to transaction ID 1034147 | org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:176) 2014-11-27 23:57:06,170 | INFO | Edit log tailer | BLOCK* addStoredBlock: blockMap updated: *.*.*.157:25009 is added to blk_1073994325_253503{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-70872ee7-101c-4c50-a7d2-8876bf23271b:NORMAL|RBW]]} size 0 | org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2393) 2014-11-27 23:57:06,170 | INFO | Edit log tailer | BLOCK* addStoredBlock: blockMap updated: *.*.*.157:25009 is added to blk_1073994326_253504{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-70872ee7-101c-4c50-a7d2-8876bf23271b:NORMAL|RBW]]} size 0 | org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2393) 2014-11-27 23:57:06,171 | INFO | Edit log tailer | BLOCK* addStoredBlock: blockMap updated: *.*.*.157:25009 is added to blk_1073994327_253505{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-70872ee7-101c-4c50-a7d2-8876bf23271b:NORMAL|RBW]]} size 0 | org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2393) 2014-11-27 23:57:06,171 | INFO | Edit log tailer | BLOCK* addStoredBlock: blockMap updated: *.*.*.157:25009 is added to blk_1073994328_253506{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-70872ee7-101c-4c50-a7d2-8876bf23271b:NORMAL|RBW]]} size 0 | org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2393) 2014-11-27 23:57:06,171 | INFO | Edit log tailer | BLOCK* addStoredBlock: blockMap updated: *.*.*.157:25009 is added to blk_1073994329_253507{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-70872ee7-101c-4c50-a7d2-8876bf23271b:NORMAL|RBW]]} size 0 | org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2393) 2014-11-27 23:57:06,173 | INFO | Edit log tailer | BLOCK* addStoredBlock: blockMap updated: *.*.*.157:25009 is added to blk_1073994330_253508{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-70872ee7-101c-4c50-a7d2-8876bf23271b:NORMAL|RBW]]} size 0 | org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2393) 2014-11-27 23:57:06,173 | INFO | Edit log tailer | BLOCK* addStoredBlock: blockMap updated: *.*.*.157:25009 is added to blk_1073994331_253509{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-70872ee7-101c-4c50-a7d2-8876bf23271b:NORMAL|RBW]]} size 0 | org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2393) 2014-11-27 23:57:06,175 | ERROR | Edit log tailer | Encountered
[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed
[ https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226141#comment-14226141 ] Brahma Reddy Battula commented on HDFS-7414: Seen following error from namenode log when it's initially came.. {noformat} 2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation CloseOp [length=0, inodeId=0, path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025, replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, clientMachine=, opCode=OP_CLOSE, txid=162982] | org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232 {noformat} and let me describe full scenario.. Deleting DIr (which is having 100 files) and running mapreduce job.. I think, while gettingLastInodeInpath it may get null, since I am deleting the dir where it edits will not synced.. {code} final INodesInPath iip = fsDir.getLastINodeInPath(path); final INodeFile file = INodeFile.valueOf(iip.getINode(0), path); {code} Anyone have Anythoughts on this..? Namenode got shutdown and can't recover where edit update might be missed - Key: HDFS-7414 URL: https://issues.apache.org/jira/browse/HDFS-7414 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.1, 2.5.1 Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Priority: Blocker Scenario: Was running mapreduce job. CPU usage crossed 190% for Datanode and machine became slow.. and seen the following exception .. *Did not get the exact root cause, but as cpu usage more edit log updation might be missed...Need dig to more...anyone have any thoughts.* {noformat} 2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation CloseOp [length=0, inodeId=0, path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025, replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, clientMachine=, opCode=OP_CLOSE, txid=162982] | org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232) java.io.FileNotFoundException: File does not exist: /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1387) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1459) 2014-11-20 05:01:18,654 | WARN | main | Encountered exception loading fsimage | org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:642) java.io.FileNotFoundException: File does not exist: /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409) at