We've been using HBase for about a year, consistently running into problems where we lost data. After reading forums and some back and forth with other HBase users, we changed our data methodology to save less data per row. This last time, we upgraded to 0.90 at the recommendation of the HBase community, cleared off all our data, and started over. It seemed to be running OK for a couple of months, until this morning. One of the regionservers stopped responding to data requests, and we tried to restart it, to no avail. Then we shut down our processes so that nothing was using HBase, shut down HBase, and brought it back up. We waited a little bit, until hbase status indicated that all the servers were back up. We turned on our processes and, lo and behold, HBase is broken. We are getting:

org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region is not online: -ROOT-,,0
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2319)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1607)
    at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1036)
And now we can't even shut it down. It seems that HBase is just too flaky to depend on for a serious system; we've not had this type of problem to this degree with conventional DB systems. Now that we are not saving that much data in HBase (we are using large HDFS files for that), we are probably going to move back to a conventional SQL system for our control data. We just can't afford to be constantly fighting problems like this.

--
Robert Gonzalez
Maxpoint Interactive
