I found the solution here (it's the new per-user epoll limits introduced in Linux kernel 2.6.27): http://pero.blogs.aprilmayjune.org/2009/01/22/hadoop-and-linux-kernel-2627-epoll-limits/
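For the archives: on 2.6.27+ kernels, epoll_create() fails with EMFILE, which the JVM surfaces as "Too many open files", once a user owns more epoll instances than the per-user limit (default 128), no matter how high the nofile limit is. A minimal sketch of the fix described at that link, run as root; the value 4096 is just a guess at something comfortable for a busy datanode, not a recommendation from the post:

    # check the current per-user limit on epoll instances (default 128 on 2.6.27)
    cat /proc/sys/fs/epoll/max_user_instances

    # raise it on the running system
    sysctl -w fs.epoll.max_user_instances=4096

    # persist it across reboots
    echo "fs.epoll.max_user_instances = 4096" >> /etc/sysctl.conf
    sysctl -p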
J-D

On Fri, Mar 6, 2009 at 6:08 PM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
> I know this one may be weird, but I'll give it a try. Thanks to anyone
> reading this through.
>
> Setup: hadoop-0.19.0 with hbase-0.19.0 on 10 nodes, quads with 8GB RAM,
> 2 disks. The nofile limit is set at 30 000, xceivers at 1023,
> dfs.datanode.socket.write.timeout at 0, dfs.datanode.handler.count at 9.
>
> I killed one of the datanodes (192.168.1.105) this morning and the
> following happened while the HBase Master was splitting the logs:
>
> 2009-03-06 08:46:14,685 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 192.168.1.105:52010
> 2009-03-06 08:46:14,685 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-4843189039542448358_284096
> 2009-03-06 08:46:23,685 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 192.168.1.105:52010
> 2009-03-06 08:46:23,686 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-7443188992680225469_284110
> 2009-03-06 08:46:41,694 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.net.NoRouteToHostException: No route to host
> 2009-03-06 08:46:41,694 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-8720774789369741650_284121
> 2009-03-06 08:46:41,698 INFO org.apache.hadoop.hdfs.DFSClient: Waiting to find target node: 192.168.1.105:52010
> 2009-03-06 08:46:47,699 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2723)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1997)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)
>
> 2009-03-06 08:46:47,699 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-8720774789369741650_284121 bad datanode[0] nodes == null
> 2009-03-06 08:46:47,699 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Aborting...
>
> The same stuff happens over and over, then:
>
> 2009-03-06 08:53:55,196 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-306786979979478823_284944 bad datanode[0] 192.168.1.104:52010
> 2009-03-06 08:53:55,197 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-306786979979478823_284944 in pipeline 192.168.1.104:52010, 192.168.1.103:52010, 192.168.1.106:52010: bad datanode 192.168.1.104:52010
> 2009-03-06 08:53:55,700 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Too many open files
> 2009-03-06 08:53:55,701 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-8714269710441125116_284946
> 2009-03-06 08:54:01,702 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Too many open files
> 2009-03-06 08:54:01,703 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-159936474097522623_284946
> 2009-03-06 08:54:07,704 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Too many open files
> 2009-03-06 08:54:07,704 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_181826096500794536_284946
> 2009-03-06 08:54:13,706 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Too many open files
> 2009-03-06 08:54:13,706 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-6644123541891903941_284946
> 2009-03-06 08:54:19,706 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2723)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1997)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)
>
> After that, it's all I see: no more "Bad connect ack...", only "Too many open files".
>
> Any ideas?
>
> Thx
>
> J-D
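
P.S. For anyone debugging a similar case: a quick way to tell whether you are exhausting plain file descriptors or epoll instances. This is a sketch that assumes a stock Linux /proc, jps from the JDK, and that the kernel names epoll fds "anon_inode:[eventpoll]" in /proc:

    # find the DataNode's pid and count its open fds (compare against ulimit -n)
    DN_PID=$(jps | awk '/DataNode/ {print $1}')
    ls /proc/$DN_PID/fd | wc -l

    # count how many of those fds are epoll instances
    # (compare against fs.epoll.max_user_instances)
    ls -l /proc/$DN_PID/fd | grep -c eventpoll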