Re: Strange HBase failure

2015-01-12 Thread Serega Sheypak
Ok, thanks, we'll check it.
2015-01-12 11:28 GMT+03:00 Esteban Gutierrez:
> Hi Serega,
>
> Do you have enough resources allocated for each VM? Just some swapping on
> the VMs or the host can make things unstable. Also from the number of
> services on each VM sounds like your host should have at

Re: Strange HBase failure

2015-01-12 Thread Esteban Gutierrez
Hi Serega, Do you have enough resources allocated for each VM? Even a little swapping on the VMs or the host can make things unstable. Also, judging by the number of services on each VM, it sounds like your host should have at least 12GB of free RAM just to run things smoothly; otherwise you might want to tr

Re: Strange HBase failure

2015-01-11 Thread Serega Sheypak
Hi, HBase was down from 08:25 to 09:15. I was looking into the logs and thinking. I've tried to find something more clever than a dummy restart. We are using the Cloudera distro; each of the daemons runs in its own JVM. I'll try to find the CPU load logs. The load is really low: Finished memstore flush of ~7.7

Re: Strange HBase failure

2015-01-11 Thread Ted Yu
Serega: Was the snippet of log from NODE01? Looks like NODE01 may have been under heavy load, considering the number of daemons running on that node. Please check the GC log. Cheers
On Sun, Jan 11, 2015 at 6:57 PM, Shuai Lin wrote:
> From the log I see no log was produced during 08:25 to 09:15,

Re: Strange HBase failure

2015-01-11 Thread Shuai Lin
From the log I see no log was produced during 08:25 to 09:15, why did this happen?
08:25:06.274 INFO org.apache.hadoop.hbase.regionserver.wal.HLog moving old hlog file /hbase/.logs/etp-hdfs-n1-sg.passport.local,60020,1414102905372/etp-hdfs-n1-sg.passport.local%2C60020%2C1414102905372.142085670602
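A silent window like the 08:25 to 09:15 one can also be located mechanically rather than by eye. The following is only a minimal sketch, assuming each log record starts with an `HH:MM:SS.mmm` time-of-day stamp as in the line quoted above; the function name `find_log_gaps` and the 5-minute threshold are hypothetical choices, not part of HBase or of any tool mentioned in this thread.

```python
import re
from datetime import datetime, timedelta

# Assumption: records begin with a time-of-day stamp such as "08:25:06.274".
TIMESTAMP = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3})")


def find_log_gaps(lines, threshold=timedelta(minutes=5)):
    """Return (start, end) time pairs where consecutive stamped lines
    are further apart than `threshold` (hypothetical default: 5 minutes)."""
    gaps = []
    previous = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # skip unstamped lines, e.g. stack trace continuations
        current = datetime.strptime(match.group(1), "%H:%M:%S.%f")
        if previous is not None and current - previous > threshold:
            gaps.append((previous.time(), current.time()))
        previous = current
    return gaps
```

It could be fed the lines of a RegionServer log file opened with `open(path)`. Because the stamps carry no date, this sketch does not handle a log that crosses midnight. Any gap it reports is only a pointer to where to look in the GC log and host metrics, not a diagnosis.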

Strange HBase failure

2015-01-11 Thread Serega Sheypak
Hi, I have a PoC HBase cluster running on 3 VMs. The deployment schema is:
NODE01: NN, SN, HMaster (HM), RegionServer (RS), ZooKeeper server (ZK), DN
NODE02: RegionServer, DN
NODE03: RegionServer, DN
Suddenly ONLY HBase went offline. All services (HM, RS, HDFS) were working and there were no alerts. ZK server was wo