Jean-Daniel Cryans created HBASE-6294:
-----------------------------------------

             Summary: Detect leftover data in ZK after a user deletes all their 
HBase data
                 Key: HBASE-6294
                 URL: https://issues.apache.org/jira/browse/HBASE-6294
             Project: HBase
          Issue Type: Improvement
    Affects Versions: 0.94.0
            Reporter: Jean-Daniel Cryans
            Priority: Critical
             Fix For: 0.96.0, 0.94.1


It seems we have a new failure mode when a user deletes the hbase.rootdir 
directory but doesn't delete the ZK data. For example, a user on IRC came with 
this log:

{noformat}
2012-06-30 09:07:48,017 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open region: kw,,1340981821308.2e8a318837602c9c9961e9d690b7fd02.
2012-06-30 09:07:48,017 WARN org.apache.hadoop.hbase.util.FSTableDescriptors: The following folder is in HBase's root directory and doesn't contain a table descriptor, do consider deleting it: kw
2012-06-30 09:07:48,018 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:34193-0x1383bfe01b70001 Attempting to transition node 2e8a318837602c9c9961e9d690b7fd02 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
2012-06-30 09:07:48,018 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=M_ZK_REGION_OFFLINE, server=localhost,50890,1341036299694, region=2e8a318837602c9c9961e9d690b7fd02
2012-06-30 09:07:48,020 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_FAILED_OPEN, server=localhost,34193,1341036300138, region=b254af24c9127b8bb22cb6d24e523dad
2012-06-30 09:07:48,020 DEBUG org.apache.hadoop.hbase.master.handler.ClosedRegionHandler: Handling CLOSED event for b254af24c9127b8bb22cb6d24e523dad
2012-06-30 09:07:48,020 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=kw_r,,1340981822374.b254af24c9127b8bb22cb6d24e523dad. state=CLOSED, ts=1341036467998, server=localhost,34193,1341036300138
2012-06-30 09:07:48,020 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:50890-0x1383bfe01b70000 Creating (or updating) unassigned node for b254af24c9127b8bb22cb6d24e523dad with OFFLINE state
2012-06-30 09:07:48,028 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:34193-0x1383bfe01b70001 Successfully transitioned node 2e8a318837602c9c9961e9d690b7fd02 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
2012-06-30 09:07:48,028 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Opening region: {NAME => 'kw,,1340981821308.2e8a318837602c9c9961e9d690b7fd02.', STARTKEY => '', ENDKEY => '', ENCODED => 2e8a318837602c9c9961e9d690b7fd02,}
2012-06-30 09:07:48,029 ERROR org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of region=kw,,1340981821308.2e8a318837602c9c9961e9d690b7fd02., starting to roll back the global memstore size.
java.lang.IllegalStateException: Could not instantiate a region instance.
        at org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:3490)
        at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3628)
        at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
        at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
        at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:679)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.GeneratedConstructorAccessor15.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
        at org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:3487)
        ... 7 more
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.loadTableCoprocessors(RegionCoprocessorHost.java:133)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.<init>(RegionCoprocessorHost.java:125)
        at org.apache.hadoop.hbase.regionserver.HRegion.<init>(HRegion.java:411)
        ... 11 more
2012-06-30 09:07:48,031 INFO org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opening of region {NAME => 'kw,,1340981821308.2e8a318837602c9c9961e9d690b7fd02.', STARTKEY => '', ENDKEY => '', ENCODED => 2e8a318837602c9c9961e9d690b7fd02,} failed, marking as FAILED_OPEN in ZK
2012-06-30 09:07:48,032 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:34193-0x1383bfe01b70001 Attempting to transition node 2e8a318837602c9c9961e9d690b7fd02 from RS_ZK_REGION_OPENING to RS_ZK_REGION_FAILED_OPEN
2012-06-30 09:07:48,031 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=localhost,34193,1341036300138, region=2e8a318837602c9c9961e9d690b7fd02
2012-06-30 09:07:48,043 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=M_ZK_REGION_OFFLINE, server=localhost,50890,1341036299694, region=b254af24c9127b8bb22cb6d24e523dad
{noformat}

The exception itself is not very useful, nor is the NPE deep in the coproc 
stack. What was really useful was this:

{noformat}
2012-06-30 09:07:48,017 WARN org.apache.hadoop.hbase.util.FSTableDescriptors: The following folder is in HBase's root directory and doesn't contain a table descriptor, do consider deleting it: kw
{noformat}

So HBase wants to assign a region from a table that doesn't exist, and we 
fail in an obscure way. I told the user to shut down HBase, nuke 
/tmp/hbase-user as it will contain both the HBase data and the ZK data, and 
restart. It worked.
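
To illustrate what a less obscure failure could look like, here is a minimal 
sketch that checks for a table descriptor before a region is opened and fails 
with an actionable message instead of an NPE. This is not HBase code: the class 
TableDescriptorGuard, its methods, and the String stand-in for a table 
descriptor are all hypothetical and only show the idea.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in HBase.
final class TableDescriptorGuard {

    // Stand-in for the table descriptors found under the root directory,
    // keyed by table name. A real descriptor is not a String.
    private final Map<String, String> descriptorsByTable = new HashMap<>();

    void addDescriptor(String table, String descriptor) {
        descriptorsByTable.put(table, descriptor);
    }

    // Returns the descriptor for the table, or throws an exception that
    // names the table and the region and hints at leftover ZK data.
    String requireDescriptor(String table, String encodedRegionName) {
        String descriptor = descriptorsByTable.get(table);
        if (descriptor == null) {
            throw new IllegalStateException(
                "Cannot open region " + encodedRegionName + ": table '" + table
                + "' has no table descriptor in the root directory."
                + " ZooKeeper may hold leftover data from a deleted installation.");
        }
        return descriptor;
    }
}
```

With a guard of this kind, the open request for the kw region in the log above 
would have reported the missing table directly rather than surfacing as an NPE 
in the coprocessor host.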

This situation is new in 0.94; we need to detect it so our users have a better 
experience getting started with HBase.
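
One possible shape for such a detection, sketched under assumptions: at 
startup, compare the root directory against the region nodes still present in 
ZK, and warn if the directory is missing or empty while ZK still tracks 
regions. The class StaleZkStateCheck and its methods are hypothetical, the 
root directory is modeled as a local java.io.File, and the ZK region nodes are 
passed in as a plain list rather than read through any real ZooKeeper or HBase 
API.

```java
import java.io.File;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only; none of these names exist in HBase.
final class StaleZkStateCheck {

    // True when the root directory does not exist, is not a directory,
    // or has no entries. File.list() returns null in the first two cases.
    static boolean isRootDirEmpty(File rootDir) {
        String[] children = rootDir.list();
        return children == null || children.length == 0;
    }

    // Returns a warning message when ZK still lists region nodes but the
    // root directory is missing or empty; returns an empty Optional otherwise.
    static Optional<String> detectLeftoverZkData(File rootDir, List<String> zkRegionNodes) {
        if (isRootDirEmpty(rootDir) && !zkRegionNodes.isEmpty()) {
            return Optional.of(
                "Root directory " + rootDir + " is missing or empty, but ZooKeeper"
                + " still tracks " + zkRegionNodes.size() + " region(s)."
                + " Delete the leftover ZooKeeper data before restarting.");
        }
        return Optional.empty();
    }
}
```

This only covers the case where the whole root directory was wiped. A partial 
deletion, such as the kw folder in the log above that exists but has no table 
descriptor, would need a per-table check instead.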


        
