One of our production clusters had several region server failures. As a result, one of the tables is in an inconsistent state, as reported by hbck. We have tried the hbck repair commands, but none of them seem to work. One region is stuck permanently in the pending-open state. The error reported in the region server log is about a StoreFile not being found. What is really strange is that the store file reported as missing does not even belong to the region being opened.

We tried manually creating the directory in HDFS and copying the missing file into it, but that causes hbck to report a region that is on HDFS but not in META.

There are currently 4 inconsistencies:

ERROR: Region { meta => <tableName>,I.1521_D.1361689200_9,1369099149747.2123fc70fac804cd8d48ea4494cc8184., hdfs => hdfs://host:8020/hbase/tableName/2123fc70fac804cd8d48ea4494cc8184, deployed => } not deployed on any region server.
ERROR: Region { meta => null, hdfs => hdfs://hostname:8020/hbase/tableName/450ed30b410e9d6d54ac53099039cb28, deployed => } on HDFS, but not listed in META or deployed on any region server
13/05/21 10:51:11 DEBUG util.HBaseFsck: There are 1769 region info entries
ERROR: There is a hole in the region chain between I.1521_D.1361689200_9 and I.1521_D.1362150000_8. You need to create a new .regioninfo and region dir in hdfs to plug the hole.
ERROR: There is a hole in the region chain between I.1_D.1368392400_9 and I.2020_D.1338948000_2. You need to create a new .regioninfo and region dir in hdfs to plug the hole.
ERROR: Found inconsistency in table <tableName>

We are running HBase 0.94 (Apache) on Hadoop 1.0.3.

At this stage we are stuck and are looking for help. The cluster is in an unbalanced state, and region servers keep dying frequently.

Thanks,
Jay
