[ https://issues.apache.org/jira/browse/HADOOP-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
dhruba borthakur updated HADOOP-2691:
-------------------------------------

    Attachment: datanodesBad3.patch

Uploading again so that this is the latest uploaded patch file; otherwise Hudson will pick up the log file as a patch file.

> Some junit tests fail with the exception: All datanodes are bad. Aborting...
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2691
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2691
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.15.2
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>             Fix For: 0.16.0
>
>         Attachments: build.log, datanodesBad.patch, datanodesBad1.log, datanodesBad1.patch, datanodesBad2.patch, datanodesBad3.patch, datanotesBad2.log, TestTableMapReduce-patch.txt
>
>
> Some junit tests fail with the following exception:
> java.io.IOException: All datanodes are bad. Aborting...
>     at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:1831)
>     at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1100(DFSClient.java:1479)
>     at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1571)
> The log contains the following messages:
> 2008-01-19 23:00:25,557 INFO dfs.StateChange (FSNamesystem.java:allocateBlock(1274)) - BLOCK* NameSystem.allocateBlock: /srcdat/three/3189919341591612220. blk_6989304691537873255
> 2008-01-19 23:00:25,559 INFO fs.DFSClient (DFSClient.java:createBlockOutputStream(1982)) - pipeline = 127.0.0.1:40678
> 2008-01-19 23:00:25,559 INFO fs.DFSClient (DFSClient.java:createBlockOutputStream(1982)) - pipeline = 127.0.0.1:40680
> 2008-01-19 23:00:25,559 INFO fs.DFSClient (DFSClient.java:createBlockOutputStream(1985)) - Connecting to 127.0.0.1:40678
> 2008-01-19 23:00:25,570 INFO dfs.DataNode (DataNode.java:writeBlock(1084)) - Receiving block blk_6989304691537873255 from /127.0.0.1
> 2008-01-19 23:00:25,572 INFO dfs.DataNode (DataNode.java:writeBlock(1084)) - Receiving block blk_6989304691537873255 from /127.0.0.1
> 2008-01-19 23:00:25,573 INFO dfs.DataNode (DataNode.java:writeBlock(1169)) - Datanode 0 forwarding connect ack to upstream firstbadlink is
> 2008-01-19 23:00:25,573 INFO dfs.DataNode (DataNode.java:writeBlock(1150)) - Datanode 1 got response for connect ack from downstream datanode with firstbadlink as
> 2008-01-19 23:00:25,573 INFO dfs.DataNode (DataNode.java:writeBlock(1169)) - Datanode 1 forwarding connect ack to upstream firstbadlink is
> 2008-01-19 23:00:25,574 INFO dfs.DataNode (DataNode.java:lastDataNodeRun(1802)) - Received block blk_6989304691537873255 of size 34 from /127.0.0.1
> 2008-01-19 23:00:25,575 INFO dfs.DataNode (DataNode.java:lastDataNodeRun(1819)) - PacketResponder 0 for block blk_6989304691537873255 terminating
> 2008-01-19 23:00:25,575 INFO dfs.StateChange (FSNamesystem.java:addStoredBlock(2467)) - BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:40680 is added to blk_6989304691537873255 size 34
> 2008-01-19 23:00:25,575 INFO dfs.DataNode (DataNode.java:close(2013)) - BlockReceiver for block blk_6989304691537873255 waiting for last write to drain.
> 2008-01-19 23:01:31,577 WARN fs.DFSClient (DFSClient.java:run(1764)) - DFSOutputStream ResponseProcessor exception for block blk_6989304691537873255 java.net.SocketTimeoutException: Read timed out
>     at java.net.SocketInputStream.socketRead0(Native Method)
>     at java.net.SocketInputStream.read(SocketInputStream.java:129)
>     at java.io.DataInputStream.readFully(DataInputStream.java:176)
>     at java.io.DataInputStream.readLong(DataInputStream.java:380)
>     at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:1726)
> 2008-01-19 23:01:31,578 INFO fs.DFSClient (DFSClient.java:run(1653)) - Closing old block blk_6989304691537873255
> 2008-01-19 23:01:31,579 WARN fs.DFSClient (DFSClient.java:processDatanodeError(1803)) - Error Recovery for block blk_6989304691537873255 bad datanode[0] 127.0.0.1:40678
> 2008-01-19 23:01:31,580 WARN fs.DFSClient (DFSClient.java:processDatanodeError(1836)) - Error Recovery for block blk_6989304691537873255 bad datanode 127.0.0.1:40678
> 2008-01-19 23:01:31,580 INFO fs.DFSClient (DFSClient.java:createBlockOutputStream(1982)) - pipeline = 127.0.0.1:40680
> 2008-01-19 23:01:31,580 INFO fs.DFSClient (DFSClient.java:createBlockOutputStream(1985)) - Connecting to 127.0.0.1:40680
> 2008-01-19 23:01:31,582 INFO dfs.DataNode (DataNode.java:writeBlock(1084)) - Receiving block blk_6989304691537873255 from /127.0.0.1
> 2008-01-19 23:01:31,584 INFO dfs.DataNode (DataNode.java:writeBlock(1196)) - writeBlock blk_6989304691537873255 received exception java.io.IOException: Reopen Block blk_6989304691537873255 is valid, and cannot be written to.
> 2008-01-19 23:01:31,584 ERROR dfs.DataNode (DataNode.java:run(997)) - 127.0.0.1:40680:DataXceiver: java.io.IOException: Reopen Block blk_6989304691537873255 is valid, and cannot be written to.
>     at org.apache.hadoop.dfs.FSDataset.writeToBlock(FSDataset.java:613)
>     at org.apache.hadoop.dfs.DataNode$BlockReceiver.<init>(DataNode.java:1996)
>     at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:1109)
>     at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:982)
>     at java.lang.Thread.run(Thread.java:595)
> 2008-01-19 23:01:31,585 INFO fs.DFSClient (DFSClient.java:createBlockOutputStream(2024)) - Exception in createBlockOutputStream java.io.EOFException
> The log shows that blk_6989304691537873255 was successfully written to two datanodes, but the dfsclient timed out waiting for a response from the first datanode. It tried to recover from the failure by resending the data to the second datanode. However, the recovery failed because the second datanode threw an IOException when it detected that it already had the block. It would be nice if the second datanode did not throw an exception for a finalized block during a recovery.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
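The reporter's suggestion can be illustrated with a small sketch. This is illustrative only, not Hadoop's actual `FSDataset.writeToBlock` logic and not the code in the attached patch: the names below (`BlockStore`, `writeBlock`, `WriteResult`) are hypothetical. It shows only the idea that a write retried during pipeline recovery against a datanode that already holds a finalized copy of the block should be acknowledged rather than rejected with an IOException.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the behavior requested in the last paragraph: during
// recovery, a datanode that already has a finalized replica of the block acks
// the write instead of throwing "Reopen Block ... is valid".
public class RecoveryWriteSketch {
    enum WriteResult { ACCEPTED, ALREADY_FINALIZED }

    static class BlockStore {
        private final Set<Long> finalized = new HashSet<>();

        WriteResult writeBlock(long blockId, boolean isRecovery) {
            if (finalized.contains(blockId)) {
                if (isRecovery) {
                    // Proposed behavior: the replica is already complete, so
                    // report success and let the client rebuild its pipeline.
                    return WriteResult.ALREADY_FINALIZED;
                }
                // Old behavior for a non-recovery write: a valid (finalized)
                // block must not be reopened for writing.
                throw new IllegalStateException(
                    "Reopen Block blk_" + blockId + " is valid, and cannot be written to.");
            }
            finalized.add(blockId); // receive + finalize, collapsed for brevity
            return WriteResult.ACCEPTED;
        }
    }

    public static void main(String[] args) {
        BlockStore dn = new BlockStore();
        // First write succeeds and finalizes the block (the size-34 write in the log).
        dn.writeBlock(6989304691537873255L, false);
        // The recovery retry against the same datanode no longer aborts the client.
        System.out.println(dn.writeBlock(6989304691537873255L, true)); // prints ALREADY_FINALIZED
    }
}
```

With the old behavior both branches throw, so the client's error recovery has no surviving datanode and fails with "All datanodes are bad"; distinguishing the recovery case is what lets the second write be treated as success.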