Yeah, see HADOOP-3831. It looks like the datanode is timing out unused connections. As I understand it, when the DFSClient later wants to read this block, it just sets up the socket again -- silently, transparently, below the level at which the application can see. Do I have that right? Is HBase itself complaining?
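If the log noise is the concern: the 480000 millis in that trace is the DataNode's write-side socket timeout, dfs.datanode.socket.write.timeout, whose default is 8 minutes (480000 ms). Some folks raise it, or set it to 0 to disable it, in hdfs-site.xml. A minimal sketch, assuming that property name matches your Hadoop version and that 0 means "no timeout" (that's how I read the code -- double-check before relying on it); whether disabling it is wise for your cluster is your call:

    <property>
      <!-- Write-side timeout the DataNode applies when streaming block data
           to a client. The default, 480000 ms (8 minutes), is the value in
           the exception quoted below; 0 turns the timeout off entirely. -->
      <name>dfs.datanode.socket.write.timeout</name>
      <value>0</value>
    </property>

Either way the DFSClient should recover; the setting mostly decides whether the DataNode times out idle connections and logs these exceptions, or just holds the sockets open.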
St.Ack

On Wed, Dec 23, 2009 at 11:10 AM, Ken Weiner <[email protected]> wrote:
> We have seen the following HADOOP error occur about 100 times a day, spread
> out throughout the day, on each RegionServer/DataNode in our always-on
> HBase/Hadoop cluster.
>
> From hadoop-gumgum-datanode-xxxxxxxxxxxx.log:
>
> 2009-12-23 09:58:29,717 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(10.255.9.187:50010,
> storageID=DS-1057956046-10.255.9.187-50010-1248395287725,
> infoPort=50075, ipcPort=50020):DataXceiver
> java.net.SocketTimeoutException: 480000 millis timeout while waiting for
> channel to be ready for write. ch : java.nio.channels.SocketChannel[connected
> local=/10.255.9.187:50010 remote=/10.255.9.187:46154]
>         at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>         at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
>         at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:313)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:400)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:180)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
>         at java.lang.Thread.run(Thread.java:619)
>
> Are other people seeing this error too? How serious is it? Can it be
> prevented?
>
> I found a few things that seem related, but I'm not sure how they apply to
> the HBase environment:
> http://issues.apache.org/jira/browse/HDFS-693
> https://issues.apache.org/jira/browse/HADOOP-3831
>
> Info on our environment:
> 1 Node: Master/NameNode/JobTracker (EC2 m1.large)
> 3 Nodes: RegionServer/DataNode/TaskTracker (EC2 m1.large)
>
> Thanks!
>
> -Ken Weiner
> GumGum & BEDROCK
