Can you use pastebin to post the whole log?
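One note on the traces you already posted, while we wait: the "Interruped while waiting for IO" messages come from Hadoop's SocketIOWithTimeout, which waits on a java.nio Selector and, if the thread gets interrupted during the wait, surfaces that as a java.io.InterruptedIOException. The datanode interrupts its DataXceiver thread group as part of shutting down (hence the "Waiting for threadgroup to exit" line), so those exceptions look like a symptom of the shutdown rather than its cause; the interesting lines are probably just before 20:33. A toy sketch of the pattern, plain JDK and not Hadoop code (the class name is mine):

import java.io.InterruptedIOException;
import java.nio.channels.Selector;

// Minimal illustration of how an interrupt during a selector wait
// becomes the InterruptedIOException seen in the datanode log.
public class InterruptedSelectDemo {
    public static void main(String[] args) throws Exception {
        Thread waiter = new Thread(() -> {
            try (Selector selector = Selector.open()) {
                // Blocks until a channel is ready, wakeup() is called,
                // or this thread is interrupted.
                selector.select();
                if (Thread.interrupted()) {
                    // Hadoop's SocketIOWithTimeout does the equivalent of
                    // this check and throws with the (misspelled)
                    // "Interruped while waiting for IO" message.
                    throw new InterruptedIOException("interrupted while waiting for IO");
                }
            } catch (Exception e) {
                System.err.println(e); // prints java.io.InterruptedIOException: ...
            }
        });
        waiter.start();
        Thread.sleep(200);  // let the waiter block in select()
        waiter.interrupt(); // what shutdown does to the DataXceiver threads
        waiter.join();
    }
}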
On Tue, Apr 22, 2014 at 12:25 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> Can you post more of the data node log, around 20:33?
>
> Cheers
>
>
> On Mon, Apr 21, 2014 at 8:57 PM, Li Li <fancye...@gmail.com> wrote:
>
> > hadoop 1.0
> > hbase 0.94.11
> >
> > datanode log from 192.168.10.45. why did it shut itself down?
> >
> > 2014-04-21 20:33:59,309 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_-7969006819959471805_202154 received exception java.io.InterruptedIOException: Interruped while waiting for IO on channel java.nio.channels.SocketChannel[closed]. 0 millis timeout left.
> > 2014-04-21 20:33:59,310 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.10.45:50010, storageID=DS-1676697306-192.168.10.45-50010-1392029190949, infoPort=50075, ipcPort=50020):DataXceiver
> > java.io.InterruptedIOException: Interruped while waiting for IO on channel java.nio.channels.SocketChannel[closed]. 0 millis timeout left.
> >         at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:349)
> >         at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
> >         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
> >         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
> >         at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
> >         at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
> >         at java.io.DataInputStream.read(DataInputStream.java:149)
> >         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:265)
> >         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:312)
> >         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:376)
> >         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:532)
> >         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:398)
> >         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:107)
> >         at java.lang.Thread.run(Thread.java:722)
> > 2014-04-21 20:33:59,310 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.10.45:50010, storageID=DS-1676697306-192.168.10.45-50010-1392029190949, infoPort=50075, ipcPort=50020):DataXceiver
> > java.io.InterruptedIOException: Interruped while waiting for IO on channel java.nio.channels.SocketChannel[closed]. 466924 millis timeout left.
> >         at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:349)
> >         at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:245)
> >         at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
> >         at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
> >         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:350)
> >         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:436)
> >         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:197)
> >         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:99)
> >         at java.lang.Thread.run(Thread.java:722)
> > 2014-04-21 20:34:00,291 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Waiting for threadgroup to exit, active threads is 0
> > 2014-04-21 20:34:00,404 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
> > 2014-04-21 20:34:00,405 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
> > 2014-04-21 20:34:00,413 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
> > 2014-04-21 20:34:00,424 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
> > /************************************************************
> > SHUTDOWN_MSG: Shutting down DataNode at app-hbase-1/192.168.10.45
> > ************************************************************/
> >
> > On Tue, Apr 22, 2014 at 11:25 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> > > bq. one datanode failed
> > >
> > > Was the crash due to out of memory error?
> > > Can you post the tail of data node log on pastebin?
> > >
> > > Giving us the versions of hadoop and hbase would be helpful.
> > >
> > >
> > > On Mon, Apr 21, 2014 at 7:39 PM, Li Li <fancye...@gmail.com> wrote:
> > >
> > >> I have a small hbase cluster with 1 namenode, 1 secondary namenode, and 4
> > >> datanodes. The hbase master is on the same machine as the namenode, and
> > >> the 4 hbase slaves are on the datanode machines.
> > >> Average requests per second are about 10,000, and the cluster crashed;
> > >> the cause was one datanode failing.
> > >>
> > >> Each datanode has about 4 cpu cores and 10GB memory.
> > >> Is my cluster overloaded?
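Separately, once the full log is up, one thing worth ruling out for a cluster pushing ~10,000 requests/sec through 4 datanodes is the datanode's concurrent-transfer cap. Both stack traces above are in DataXceiver threads, and on hadoop 1.x the number of those threads is bounded by dfs.datanode.max.xcievers (the misspelling is the actual property name); the HBase documentation recommends raising it well above the default for HBase workloads. A sketch of the usual hdfs-site.xml change; 4096 is the commonly suggested value, adjust to your load:

<!-- hdfs-site.xml on each datanode (restart datanodes afterwards).
     Hadoop 1.x property name; 2.x renamed it dfs.datanode.max.transfer.threads. -->
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>

If that limit were being hit, the datanode log should contain messages along the lines of "exceeds the limit of concurrent xcievers", so the full log will confirm or rule this out. Raising the datanode user's open-file ulimit usually goes together with this change.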