Hi Ted,

Yes, I checked the namenode and datanode logs and found the exceptions below:

Name node:

java.io.IOException: File /hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e could only be replicated to 0 nodes, instead of 1

java.io.IOException: Got blockReceived message from unregistered or dead node blk_-2949905629769882833_52274

Data node:

480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/192.168.20.30:50010 remote=/192.168.20.30:36188]

ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.20.30:50010, storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075, ipcPort=50020):DataXceiver
java.io.EOFException: while trying to read 39309 bytes

On Tue, Oct 22, 2013 at 10:19 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> bq. java.io.IOException: File /hbase/event_data/4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0
> could only be replicated to 0 nodes, instead of 1
>
> Have you checked Namenode / Datanode logs ?
> Looks like hdfs was not stable.
>
> On Tue, Oct 22, 2013 at 9:01 AM, Vimal Jain <vkj...@gmail.com> wrote:
>
> > Hi Jean,
> > Thanks for your reply.
> > I have 8 GB of memory in total, distributed as follows:
> >
> > Region server - 2 GB
> > Master, Namenode, Datanode, Secondary Namenode, Zookeeper - 1 GB
> > OS - 1 GB
> >
> > Please let me know if you need more information.
> >
> > On Tue, Oct 22, 2013 at 8:15 PM, Jean-Marc Spaggiari <jean-m...@spaggiari.org> wrote:
> >
> > > Hi Vimal,
> > >
> > > What are your settings? Memory of the host, and memory allocated for the
> > > different HBase services?
> > >
> > > Thanks,
> > >
> > > JM
> > >
> > > 2013/10/22 Vimal Jain <vkj...@gmail.com>
> > >
> > > > Hi,
> > > > I am running HBase in pseudo-distributed mode.
> > > > (Hadoop version - 1.1.2, HBase version - 0.94.7)
> > > > I am getting a few exceptions in both the Hadoop (namenode, datanode) logs
> > > > and the HBase (region server) logs.
> > > > When I searched for these exceptions on Google, I concluded that the
> > > > problem is mainly due to a large number of full GCs in the region server
> > > > process.
> > > >
> > > > I used jstat and found that there were a total of 950 full GCs in a span
> > > > of 4 days for the region server process. Is this OK?
> > > >
> > > > I am totally confused by the number of exceptions I am getting.
> > > > Also, I get the exceptions below intermittently.
> > > >
> > > > Region server:
> > > >
> > > > 2013-10-22 12:00:26,627 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":15312,"call":"next(-6681408251916104762, 1000), rpc version=1, client version=29, methodsFingerPrint=-1368823753","client":"192.168.20.31:48270","starttimems":1382423411293,"queuetimems":0,"class":"HRegionServer","responsesize":4808556,"method":"next"}
> > > > 2013-10-22 12:06:17,606 WARN org.apache.hadoop.ipc.HBaseServer: (operationTooSlow): {"processingtimems":14759,"client":"192.168.20.31:48247","timeRange":[0,9223372036854775807],"starttimems":1382423762845,"responsesize":61,"class":"HRegionServer","table":"event_data","cacheBlocks":true,"families":{"ginfo":["netGainPool"]},"row":"1629657","queuetimems":0,"method":"get","totalColumns":1,"maxVersions":1}
> > > >
> > > > 2013-10-18 10:37:45,008 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /hbase/event_data/4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0 could only be replicated to 0 nodes, instead of 1
> > > >     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1639)
> > > >
> > > > Name node:
> > > >
> > > > java.io.IOException: File /hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e could only be replicated to 0 nodes, instead of 1
> > > >
> > > > java.io.IOException: Got blockReceived message from unregistered or dead node blk_-2949905629769882833_52274
> > > >
> > > > Data node:
> > > >
> > > > 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/192.168.20.30:50010 remote=/192.168.20.30:36188]
> > > >
> > > > ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.20.30:50010, storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075, ipcPort=50020):DataXceiver
> > > > java.io.EOFException: while trying to read 39309 bytes
> > > >
> > > > --
> > > > Thanks and Regards,
> > > > Vimal Jain
> >
> > --
> > Thanks and Regards,
> > Vimal Jain

--
Thanks and Regards,
Vimal Jain