[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed

2015-02-03 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14303678#comment-14303678
 ] 

Brahma Reddy Battula commented on HDFS-7414:


This defect is  duplicate to HDFS-7707

 Namenode got shutdown and can't recover where edit update might be missed
 -

 Key: HDFS-7414
 URL: https://issues.apache.org/jira/browse/HDFS-7414
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.1, 2.6.0, 2.5.1
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Priority: Blocker

 Scenario:
 
 Was running mapreduce job.
 CPU usage crossed 190% for Datanode and machine became slow..
 and seen the following exception .. 
  *Did not get the exact root cause, but as cpu usage more edit log updation 
 might be missed...Need dig to more...anyone have any thoughts.* 
 {noformat}
 2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation 
 CloseOp [length=0, inodeId=0, 
 path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025,
  replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, 
 blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], 
 permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, 
 clientMachine=, opCode=OP_CLOSE, txid=162982] | 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232)
 java.io.FileNotFoundException: File does not exist: 
 /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1387)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1459)
 2014-11-20 05:01:18,654 | WARN  | main | Encountered exception loading 
 fsimage | 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:642)
 java.io.FileNotFoundException: File does not exist: 
 /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724)
  

[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed

2014-12-15 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14246436#comment-14246436
 ] 

Rakesh R commented on HDFS-7414:


Yeah [~brahmareddy], edit log viewer shows the same. I'm suspecting the could 
be chances of the following:

 Namenode got shutdown and can't recover where edit update might be missed
 -

 Key: HDFS-7414
 URL: https://issues.apache.org/jira/browse/HDFS-7414
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.1, 2.5.1
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Priority: Blocker

 Scenario:
 
 Was running mapreduce job.
 CPU usage crossed 190% for Datanode and machine became slow..
 and seen the following exception .. 
  *Did not get the exact root cause, but as cpu usage more edit log updation 
 might be missed...Need dig to more...anyone have any thoughts.* 
 {noformat}
 2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation 
 CloseOp [length=0, inodeId=0, 
 path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025,
  replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, 
 blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], 
 permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, 
 clientMachine=, opCode=OP_CLOSE, txid=162982] | 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232)
 java.io.FileNotFoundException: File does not exist: 
 /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1387)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1459)
 2014-11-20 05:01:18,654 | WARN  | main | Encountered exception loading 
 fsimage | 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:642)
 java.io.FileNotFoundException: File does not exist: 
 /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741)
 at 
 

[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed

2014-12-15 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14246440#comment-14246440
 ] 

Rakesh R commented on HDFS-7414:


Yeah Brahma, edit log viewer shows the same. I'm suspecting the chances of 
occurring the below operations concurrently. Let me try to reproduce the same.

operation-1) internal lease releases occurred and initialize block recovery. 
This will add the OP_CLOSE entry.
operation-2) client deleted the file. This will add OP_DELETE entry

 Namenode got shutdown and can't recover where edit update might be missed
 -

 Key: HDFS-7414
 URL: https://issues.apache.org/jira/browse/HDFS-7414
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.1, 2.5.1
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Priority: Blocker

 Scenario:
 
 Was running mapreduce job.
 CPU usage crossed 190% for Datanode and machine became slow..
 and seen the following exception .. 
  *Did not get the exact root cause, but as cpu usage more edit log updation 
 might be missed...Need dig to more...anyone have any thoughts.* 
 {noformat}
 2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation 
 CloseOp [length=0, inodeId=0, 
 path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025,
  replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, 
 blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], 
 permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, 
 clientMachine=, opCode=OP_CLOSE, txid=162982] | 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232)
 java.io.FileNotFoundException: File does not exist: 
 /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1387)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1459)
 2014-11-20 05:01:18,654 | WARN  | main | Encountered exception loading 
 fsimage | 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:642)
 java.io.FileNotFoundException: File does not exist: 
 /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519)

[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed

2014-12-15 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14246451#comment-14246451
 ] 

Vinayakumar B commented on HDFS-7414:
-

Looks like you hit the HDFS-6825 which was also due to extra OP_CLOSE edits. 
Check whether you got the stacktrace mentioned in HDFS-6825 to confirm the same.

 Namenode got shutdown and can't recover where edit update might be missed
 -

 Key: HDFS-7414
 URL: https://issues.apache.org/jira/browse/HDFS-7414
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.1, 2.5.1
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Priority: Blocker

 Scenario:
 
 Was running mapreduce job.
 CPU usage crossed 190% for Datanode and machine became slow..
 and seen the following exception .. 
  *Did not get the exact root cause, but as cpu usage more edit log updation 
 might be missed...Need dig to more...anyone have any thoughts.* 
 {noformat}
 2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation 
 CloseOp [length=0, inodeId=0, 
 path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025,
  replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, 
 blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], 
 permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, 
 clientMachine=, opCode=OP_CLOSE, txid=162982] | 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232)
 java.io.FileNotFoundException: File does not exist: 
 /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1387)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1459)
 2014-11-20 05:01:18,654 | WARN  | main | Encountered exception loading 
 fsimage | 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:642)
 java.io.FileNotFoundException: File does not exist: 
 /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575)
 at 
 

[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed

2014-12-15 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14246471#comment-14246471
 ] 

Rakesh R commented on HDFS-7414:


Hi Vinay, By seeing the [HDFS-6825 comment 
|https://issues.apache.org/jira/browse/HDFS-6825?focusedCommentId=14098682page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14098682],
 it looks like similar case.

 Namenode got shutdown and can't recover where edit update might be missed
 -

 Key: HDFS-7414
 URL: https://issues.apache.org/jira/browse/HDFS-7414
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.1, 2.5.1
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Priority: Blocker

 Scenario:
 
 Was running mapreduce job.
 CPU usage crossed 190% for Datanode and machine became slow..
 and seen the following exception .. 
  *Did not get the exact root cause, but as cpu usage more edit log updation 
 might be missed...Need dig to more...anyone have any thoughts.* 
 {noformat}
 2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation 
 CloseOp [length=0, inodeId=0, 
 path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025,
  replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, 
 blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], 
 permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, 
 clientMachine=, opCode=OP_CLOSE, txid=162982] | 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232)
 java.io.FileNotFoundException: File does not exist: 
 /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1387)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1459)
 2014-11-20 05:01:18,654 | WARN  | main | Encountered exception loading 
 fsimage | 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:642)
 java.io.FileNotFoundException: File does not exist: 
 /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575)
 at 

[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed

2014-12-14 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14245908#comment-14245908
 ] 

Brahma Reddy Battula commented on HDFS-7414:


Got the cause:

OP_CLOSE has been added after the OP_DELETE of the parent node, This causes the 
exception. 

So ignore the node file doesnt exists exception as the file is already deleted 
from the Namenode so it can suppress this exception.

 Namenode got shutdown and can't recover where edit update might be missed
 -

 Key: HDFS-7414
 URL: https://issues.apache.org/jira/browse/HDFS-7414
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.1, 2.5.1
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Priority: Blocker

 Scenario:
 
 Was running mapreduce job.
 CPU usage crossed 190% for Datanode and machine became slow..
 and seen the following exception .. 
  *Did not get the exact root cause, but as cpu usage more edit log updation 
 might be missed...Need dig to more...anyone have any thoughts.* 
 {noformat}
 2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation 
 CloseOp [length=0, inodeId=0, 
 path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025,
  replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, 
 blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], 
 permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, 
 clientMachine=, opCode=OP_CLOSE, txid=162982] | 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232)
 java.io.FileNotFoundException: File does not exist: 
 /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1387)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1459)
 2014-11-20 05:01:18,654 | WARN  | main | Encountered exception loading 
 fsimage | 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:642)
 java.io.FileNotFoundException: File does not exist: 
 /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519)
 at 
 

[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed

2014-12-04 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234088#comment-14234088
 ] 

Brahma Reddy Battula commented on HDFS-7414:


Hi [~yzhangal]

Thanks for taking a look into this issue..

Yes, I have the HDFS-6527 fix..Please check the following for same...
{code}
private boolean isFileDeleted(INodeFile file)
  {
return (this.dir.getInode(file.getId()) == null) || (file.getParent() == 
null) || ((file.isWithSnapshot())  
(file.getFileWithSnapshotFeature().isCurrentFileDeleted()));
  }
{code}

 Namenode got shutdown and can't recover where edit update might be missed
 -

 Key: HDFS-7414
 URL: https://issues.apache.org/jira/browse/HDFS-7414
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.1, 2.5.1
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Priority: Blocker

 Scenario:
 
 Was running mapreduce job.
 CPU usage crossed 190% for Datanode and machine became slow..
 and seen the following exception .. 
  *Did not get the exact root cause, but as cpu usage more edit log updation 
 might be missed...Need dig to more...anyone have any thoughts.* 
 {noformat}
 2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation 
 CloseOp [length=0, inodeId=0, 
 path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025,
  replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, 
 blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], 
 permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, 
 clientMachine=, opCode=OP_CLOSE, txid=162982] | 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232)
 java.io.FileNotFoundException: File does not exist: 
 /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1387)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1459)
 2014-11-20 05:01:18,654 | WARN  | main | Encountered exception loading 
 fsimage | 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:642)
 java.io.FileNotFoundException: File does not exist: 
 /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640)
 at 
 

[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed

2014-12-04 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234684#comment-14234684
 ] 

Yongjun Zhang commented on HDFS-7414:
-

HI [~brahmareddy],

There is another related jira to be aware of: HDFS-6825.

If you could describe the steps that reproduce the issue, that will help.

Thanks.



 Namenode got shutdown and can't recover where edit update might be missed
 -

 Key: HDFS-7414
 URL: https://issues.apache.org/jira/browse/HDFS-7414
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.1, 2.5.1
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Priority: Blocker

 Scenario:
 
 Was running mapreduce job.
 CPU usage crossed 190% for Datanode and machine became slow..
 and seen the following exception .. 
  *Did not get the exact root cause, but as cpu usage more edit log updation 
 might be missed...Need dig to more...anyone have any thoughts.* 
 {noformat}
 2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation 
 CloseOp [length=0, inodeId=0, 
 path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025,
  replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, 
 blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], 
 permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, 
 clientMachine=, opCode=OP_CLOSE, txid=162982] | 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232)
 java.io.FileNotFoundException: File does not exist: 
 /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1387)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1459)
 2014-11-20 05:01:18,654 | WARN  | main | Encountered exception loading 
 fsimage | 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:642)
 java.io.FileNotFoundException: File does not exist: 
 /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575)
 at 
 

[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed

2014-12-03 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14233885#comment-14233885
 ] 

Yongjun Zhang commented on HDFS-7414:
-

HI [~brahmareddy],

Thanks for reporting this issue. The symptom looks similar to HDFS-6527, which 
is fixed in 2.5.0. You reported here 2.5.1, I assume you have the fix of 
HDFS-6527, but would you please confirm?  Thanks.



 Namenode got shutdown and can't recover where edit update might be missed
 -

 Key: HDFS-7414
 URL: https://issues.apache.org/jira/browse/HDFS-7414
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.1, 2.5.1
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Priority: Blocker

 Scenario:
 
 Was running mapreduce job.
 CPU usage crossed 190% for Datanode and machine became slow..
 and seen the following exception .. 
  *Did not get the exact root cause, but as cpu usage more edit log updation 
 might be missed...Need dig to more...anyone have any thoughts.* 
 {noformat}
 2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation 
 CloseOp [length=0, inodeId=0, 
 path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025,
  replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, 
 blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], 
 permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, 
 clientMachine=, opCode=OP_CLOSE, txid=162982] | 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232)
 java.io.FileNotFoundException: File does not exist: 
 /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1387)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1459)
 2014-11-20 05:01:18,654 | WARN  | main | Encountered exception loading 
 fsimage | 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:642)
 java.io.FileNotFoundException: File does not exist: 
 /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575)
 at 

[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed

2014-12-02 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14232582#comment-14232582
 ] 

Brahma Reddy Battula commented on HDFS-7414:


 Following is my analysis,two things might cause edits missing,1)snapshot 
2)delete dir while tailing edits inprogress..

 *{color:blue}StandBy Namenode{color}* 

{noformat}
2014-11-27 23:57:05,835 | INFO  | Edit log tailer | Triggering log roll on 
remote NameNode linux157/*.*.*.157:25000 | 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:269)
2014-11-27 23:57:06,129 | INFO  | Edit log tailer | Reading 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@419cf8c9 
expecting start txid #1034147 | 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:802)
2014-11-27 23:57:06,130 | INFO  | Edit log tailer | Start loading edits file 
https://linux158:8481/getJournal?jid=haclustersegmentTxId=1034147storageInfo=-56%3A295084204%3A0%3Amyhacluster,
 
https://linux156:8481/getJournal?jid=haclustersegmentTxId=1034147storageInfo=-56%3A295084204%3A0%3Amyhacluster
 | 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:132)
2014-11-27 23:57:06,130 | INFO  | Edit log tailer | Fast-forwarding stream 
'https://linux158:8481/getJournal?jid=haclustersegmentTxId=1034147storageInfo=-56%3A295084204%3A0%3Amyhacluster,
 
https://linux156:8481/getJournal?jid=haclustersegmentTxId=1034147storageInfo=-56%3A295084204%3A0%3Amyhacluster'
 to transaction ID 1034147 | 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:176)
2014-11-27 23:57:06,130 | INFO  | Edit log tailer | Fast-forwarding stream 
'https://linux158:8481/getJournal?jid=haclustersegmentTxId=1034147storageInfo=-56%3A295084204%3A0%3Amyhacluster'
 to transaction ID 1034147 | 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:176)
2014-11-27 23:57:06,170 | INFO  | Edit log tailer | BLOCK* addStoredBlock: 
blockMap updated: *.*.*.157:25009 is added to 
blk_1073994325_253503{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[[DISK]DS-70872ee7-101c-4c50-a7d2-8876bf23271b:NORMAL|RBW]]}
 size 0 | 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2393)
2014-11-27 23:57:06,170 | INFO  | Edit log tailer | BLOCK* addStoredBlock: 
blockMap updated: *.*.*.157:25009 is added to 
blk_1073994326_253504{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[[DISK]DS-70872ee7-101c-4c50-a7d2-8876bf23271b:NORMAL|RBW]]}
 size 0 | 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2393)
2014-11-27 23:57:06,171 | INFO  | Edit log tailer | BLOCK* addStoredBlock: 
blockMap updated: *.*.*.157:25009 is added to 
blk_1073994327_253505{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[[DISK]DS-70872ee7-101c-4c50-a7d2-8876bf23271b:NORMAL|RBW]]}
 size 0 | 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2393)
2014-11-27 23:57:06,171 | INFO  | Edit log tailer | BLOCK* addStoredBlock: 
blockMap updated: *.*.*.157:25009 is added to 
blk_1073994328_253506{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[[DISK]DS-70872ee7-101c-4c50-a7d2-8876bf23271b:NORMAL|RBW]]}
 size 0 | 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2393)
2014-11-27 23:57:06,171 | INFO  | Edit log tailer | BLOCK* addStoredBlock: 
blockMap updated: *.*.*.157:25009 is added to 
blk_1073994329_253507{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[[DISK]DS-70872ee7-101c-4c50-a7d2-8876bf23271b:NORMAL|RBW]]}
 size 0 | 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2393)
2014-11-27 23:57:06,173 | INFO  | Edit log tailer | BLOCK* addStoredBlock: 
blockMap updated: *.*.*.157:25009 is added to 
blk_1073994330_253508{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[[DISK]DS-70872ee7-101c-4c50-a7d2-8876bf23271b:NORMAL|RBW]]}
 size 0 | 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2393)
2014-11-27 23:57:06,173 | INFO  | Edit log tailer | BLOCK* addStoredBlock: 
blockMap updated: *.*.*.157:25009 is added to 
blk_1073994331_253509{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[[DISK]DS-70872ee7-101c-4c50-a7d2-8876bf23271b:NORMAL|RBW]]}
 size 0 | 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2393)
2014-11-27 23:57:06,175 | ERROR | Edit log tailer | Encountered 

[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed

2014-11-26 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226141#comment-14226141
 ] 

Brahma Reddy Battula commented on HDFS-7414:


Seen following error from namenode log when it's initially came..

{noformat}
2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation 
CloseOp [length=0, inodeId=0, 
path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025,
 replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, 
blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], 
permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, 
clientMachine=, opCode=OP_CLOSE, txid=162982] | 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232
{noformat}

and let me describe full scenario..

Deleting DIr (which is having 100 files) and running mapreduce job..

I think, while gettingLastInodeInpath it may get null, since I am deleting the 
dir where it edits will not synced..

{code}
  final INodesInPath iip = fsDir.getLastINodeInPath(path);
  final INodeFile file = INodeFile.valueOf(iip.getINode(0), path);
{code}

Anyone have Anythoughts on this..?




 Namenode got shutdown and can't recover where edit update might be missed
 -

 Key: HDFS-7414
 URL: https://issues.apache.org/jira/browse/HDFS-7414
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.1, 2.5.1
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Priority: Blocker

 Scenario:
 
 Was running mapreduce job.
 CPU usage crossed 190% for Datanode and machine became slow..
 and seen the following exception .. 
  *Did not get the exact root cause, but as cpu usage more edit log updation 
 might be missed...Need dig to more...anyone have any thoughts.* 
 {noformat}
 2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation 
 CloseOp [length=0, inodeId=0, 
 path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025,
  replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, 
 blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], 
 permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, 
 clientMachine=, opCode=OP_CLOSE, txid=162982] | 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232)
 java.io.FileNotFoundException: File does not exist: 
 /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1387)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1459)
 2014-11-20 05:01:18,654 | WARN  | main | Encountered exception loading 
 fsimage | 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:642)
 java.io.FileNotFoundException: File does not exist: 
 /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409)
 at