Can you try the 0.19.3 RC1 posted yesterday: http://people.apache.org/~stack/hbase-0.19.3-candidate-1/
It has fixes that should help stop the master and regionserver from getting into a state of disagreement (in particular, HBASE-1421 and HBASE-1344).

Thanks,
St.Ack

On Thu, May 21, 2009 at 11:15 AM, Kirill Shabunov <[email protected]> wrote:
> Hi!
>
> I am running HBase 0.19.2, r771918 on top of Hadoop 0.19.1, r745977.
>
> When I stress the system with lots of uploaded data it often happens a
> Regionserver gets overloaded and is lost by the cluster. This is
> understandable. However, right after that HBase becomes generally unstable
> and then often "loses" some of the regions making the DB tables corrupt. The
> aftermath symptoms are:
> 1. A region appears in the "Regions in" list for its table.
> 3. That region is missing from the "Online Regions" list of the
> Regionserver responsible for it.
>
> In other words, it seems, Master thinks region R belongs to the
> regionserver X, but X does not agree. When I request data from that region
> Master directs the client to X and X throws NotServingRegionException.
>
> Has anybody met this kind of problem? Is there any remedy for this
> apart from adding more power to the cluster so the regions do not fail?
>
> It is acceptable if the system becomes unavailable for some time when it
> gets stressed too much, but it should not lose data.
>
> Thanks a lot!
>
> --Kirill
