Re: Bad connect ack with firstBadLink
Hi,

Increasing the open file limit solved the issue. Thank you.

On Fri, May 4, 2012 at 9:39 PM, Mapred Learn mapred.le...@gmail.com wrote:
> Check your number of blocks in the cluster. This also indicates that your datanodes are more full than they should be. Try deleting unnecessary blocks.
>
> On Fri, May 4, 2012 at 7:40 AM, Mohit Anchlia mohitanch...@gmail.com wrote:
>> Please see: http://hbase.apache.org/book.html#dfs.datanode.max.xcievers
>>
>> On Fri, May 4, 2012 at 5:46 AM, madhu phatak phatak@gmail.com wrote:
>>> We are running a three-node cluster. For the past two days, whenever we copy a file to HDFS, it throws java.io.IOException: Bad connect ack with firstBadLink.

-- https://github.com/zinnia-phatak-dev/Nectar
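The fix reported in this message was raising the open file limit. As a rough sketch (not something posted in the thread), the limit in effect for a process on a Unix-like system can be inspected with Python's standard `resource` module. The `MIN_RECOMMENDED` threshold and the helper names are hypothetical, chosen only for illustration; the posters do not say which value they used.

```python
import resource

# Hypothetical threshold for illustration only; the thread does not state a value.
MIN_RECOMMENDED = 16384


def open_file_limits():
    """Return the (soft, hard) RLIMIT_NOFILE limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def is_limit_low(soft, minimum=MIN_RECOMMENDED):
    """Return True if the soft limit is finite and below the given minimum."""
    return soft != resource.RLIM_INFINITY and soft < minimum


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"soft={soft} hard={hard} low={is_limit_low(soft)}")
```

Note that this reports the limit of the process running the script. To learn what applies to a datanode, it would have to be run as the same user and from the same kind of session that starts the datanode.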
Bad connect ack with firstBadLink
Hi,

We are running a three-node cluster. For the past two days, whenever we copy a file to HDFS, it throws java.io.IOException: Bad connect ack with firstBadLink. I searched the net but was not able to resolve the issue. The following is the stack trace from the datanode log:

2012-05-04 18:08:08,868 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_-7520371350112346377_50118 received exception java.net.SocketException: Connection reset
2012-05-04 18:08:08,869 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(172.23.208.17:50010, storageID=DS-1340171424-172.23.208.17-50010-1334672673051, infoPort=50075, ipcPort=50020):DataXceiver
java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:168)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    at java.io.DataInputStream.read(DataInputStream.java:132)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:262)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:309)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:373)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:525)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:357)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
    at java.lang.Thread.run(Thread.java:662)

It would be great if someone could point me in the right direction to solve this problem.

-- https://github.com/zinnia-phatak-dev/Nectar
Re: Bad connect ack with firstBadLink
Please see: http://hbase.apache.org/book.html#dfs.datanode.max.xcievers

On Fri, May 4, 2012 at 5:46 AM, madhu phatak phatak@gmail.com wrote:
> We are running a three-node cluster. For the past two days, whenever we copy a file to HDFS, it throws java.io.IOException: Bad connect ack with firstBadLink.
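The link above concerns the `dfs.datanode.max.xcievers` property, which is set in the datanode's `hdfs-site.xml`. As a hedged sketch of how one might check what a given configuration file sets, the snippet below reads a property from Hadoop-style configuration XML using Python's standard library. The sample fragment, its value of 4096, and the `get_property` helper are assumptions made for this example, not values reported by anyone in the thread.

```python
import xml.etree.ElementTree as ET

# Illustrative hdfs-site.xml fragment; the value 4096 is an assumption for this example.
SAMPLE_HDFS_SITE = """<configuration>
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
  </property>
</configuration>"""


def get_property(xml_text, name):
    """Return the value of a Hadoop configuration property, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


if __name__ == "__main__":
    print(get_property(SAMPLE_HDFS_SITE, "dfs.datanode.max.xcievers"))
```

A result of None means the file does not set the property, so the datanode would be using its built-in default.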
Re: Bad connect ack with firstBadLink
Check the number of blocks in your cluster. This also indicates that your datanodes are fuller than they should be. Try deleting unnecessary blocks.

On Fri, May 4, 2012 at 7:40 AM, Mohit Anchlia mohitanch...@gmail.com wrote:
> Please see: http://hbase.apache.org/book.html#dfs.datanode.max.xcievers
>
> On Fri, May 4, 2012 at 5:46 AM, madhu phatak phatak@gmail.com wrote:
>> We are running a three-node cluster. For the past two days, whenever we copy a file to HDFS, it throws java.io.IOException: Bad connect ack with firstBadLink.
java.io.IOException: Bad connect ack with firstBadLink
Hi,

While running a Hadoop map/reduce job, I got the exception below.

1) Why does it happen?
2) The job did not fail and continued its execution. Does this exception cause data loss, or does map/reduce use a recovery mechanism?

2010-11-09 05:10:08,735 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 10.11.87.65:50010
2010-11-09 05:10:08,735 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-9208177033562590356_775948
2010-11-09 05:10:08,739 INFO org.apache.hadoop.hdfs.DFSClient: Waiting to find target node: 10.11.87.61:50010
2010-11-09 05:11:23,743 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: 69000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.11.87.61:55309 remote=/10.11.87.61:50010]
2010-11-09 05:11:23,743 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-10251707095594311_775950
2010-11-09 05:11:23,744 INFO org.apache.hadoop.hdfs.DFSClient: Waiting to find target node: 10.11.87.61:50010
2010-11-09 05:12:29,815 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 10.11.87.65:50010
2010-11-09 05:12:29,816 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_3509928762116143133_775950
2010-11-09 05:12:29,818 INFO org.apache.hadoop.hdfs.DFSClient: Waiting to find target node: 10.11.87.61:50010
2010-11-09 05:13:35,949 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 10.11.87.65:50010
2010-11-09 05:13:35,949 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_3138002906377068146_775950
2010-11-09 05:13:35,950 INFO org.apache.hadoop.hdfs.DFSClient: Waiting to find target node: 10.11.87.61:50010
2010-11-09 05:13:51,757 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2812)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2076)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2262)
2010-11-09 05:13:51,757 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_3138002906377068146_775950 bad datanode[2] nodes == null
2010-11-09 05:13:51,758 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file /user/hadoop/requests_logs/merged/2010-11-08/_temporary/_attempt_201011081008_0002_r_08_0/part-r-8 - Aborting...
2010-11-09 05:13:51,760 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
java.io.IOException: Bad connect ack with firstBadLink 10.11.87.65:50010
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2870)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2793)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2076)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2262)
2010-11-09 05:13:54,888 INFO org.apache.hadoop.mapred.TaskRunner: Runnning cleanup for the task

Thanks,
Oleg.
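When a client log contains many of these entries, it can help to see which datanode is named as the first bad link most often. The following is a minimal sketch that tallies the addresses with a regular expression; the `count_bad_links` helper is hypothetical, and the sample lines are copied from the log in this message.

```python
import re
from collections import Counter

# Captures the datanode address that follows "firstBadLink" in DFSClient log lines.
BAD_LINK = re.compile(r"Bad connect ack with firstBadLink (\d+\.\d+\.\d+\.\d+:\d+)")


def count_bad_links(lines):
    """Count how often each datanode address is reported as the first bad link."""
    counts = Counter()
    for line in lines:
        match = BAD_LINK.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Three entries taken from the log posted in this message.
SAMPLE_LOG = [
    "2010-11-09 05:10:08,735 INFO org.apache.hadoop.hdfs.DFSClient: Exception in "
    "createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 10.11.87.65:50010",
    "2010-11-09 05:11:23,743 INFO org.apache.hadoop.hdfs.DFSClient: Exception in "
    "createBlockOutputStream java.net.SocketTimeoutException: 69000 millis timeout "
    "while waiting for channel to be ready for read.",
    "2010-11-09 05:12:29,815 INFO org.apache.hadoop.hdfs.DFSClient: Exception in "
    "createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 10.11.87.65:50010",
]

if __name__ == "__main__":
    for address, hits in count_bad_links(SAMPLE_LOG).most_common():
        print(address, hits)
```

In the log above, every firstBadLink entry names 10.11.87.65:50010, which suggests that node is the one to inspect first.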
Re: java.io.IOException: Bad connect ack with firstBadLink
What does this mean? It looks like a second attempt to process the data after the first one failed?

All Task Attempts

Task Attempt: attempt_201011081008_0002_r_08_0
Machine: /default-rack/http://hadoop1.infolinks.local:8022
Status: FAILED
Progress: 0.00%
Start Time: 9-Nov-2010 04:36:15
Shuffle Finished: 9-Nov-2010 05:09:08 (32mins, 52sec)
Sort Finished: 9-Nov-2010 05:09:08 (0sec)
Finish Time: 9-Nov-2010 05:14:07 (37mins, 51sec)
Errors:
java.io.IOException: Bad connect ack with firstBadLink 10.11.87.65:50010
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2870)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2793)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2076)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2262)
Task Logs:
    Last 4KB: http://hadoop1.infolinks.local:8022/tasklog?taskid=attempt_201011081008_0002_r_08_0start=-4097
    Last 8KB: http://hadoop1.infolinks.local:8022/tasklog?taskid=attempt_201011081008_0002_r_08_0start=-8193
    All: http://hadoop1.infolinks.local:8022/tasklog?taskid=attempt_201011081008_0002_r_08_0all=true
Counters: 10 /taskstats.jsp?jobid=job_201011081008_0002tipid=task_201011081008_0002_r_08taskid=attempt_201011081008_0002_r_08_0

Task Attempt: attempt_201011081008_0002_r_08_1
Machine: /default-rack/http://hadoop-transfer.infolinks.local:8022
Status: RUNNING
Progress: 24.98%
Start Time: 9-Nov-2010 05:50:21

On Tue, Nov 9, 2010 at 12:58 PM, Oleg Ruchovets oruchov...@gmail.com wrote:
> While running a Hadoop map/reduce job, I got the exception below.
> 1) Why does it happen?
> 2) The job did not fail and continued its execution. Does this exception cause data loss, or does map/reduce use a recovery mechanism?