[ 
https://issues.apache.org/jira/browse/HBASE-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435592#comment-13435592
 ] 

stack commented on HBASE-6294:
------------------------------

@Deveraj So that seems like a decent workaround.

Going back to J-D's original comment, sounds like we shouldn't be assigning 
regions for tables that don't exist.  Or if a regionserver gets a region to 
open that is for a non-existent table, it should just eat it up with a nice log 
message.
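
A rough sketch of the regionserver-side idea, not actual HBase code. 
OpenRegionGuard, tryOpen and the knownTables set are made-up names for 
illustration; the set stands in for whatever lookup tells us a table 
descriptor exists:

```java
import java.util.Set;
import java.util.logging.Logger;

// Illustrative only: none of these names are real HBase classes or methods.
final class OpenRegionGuard {
    private static final Logger LOG = Logger.getLogger(OpenRegionGuard.class.getName());

    enum Outcome { OPENED, SKIPPED_NO_TABLE }

    // knownTables is a hypothetical stand-in for a table descriptor lookup.
    static Outcome tryOpen(String regionName, Set<String> knownTables) {
        // Region names look like "table,startkey,timestamp.encoded."; take the table part.
        int comma = regionName.indexOf(',');
        String table = comma < 0 ? regionName : regionName.substring(0, comma);
        if (!knownTables.contains(table)) {
            // Eat the request with a readable message instead of failing deep in the open path.
            LOG.warning("Skipping open of " + regionName
                + ": table '" + table + "' has no descriptor");
            return Outcome.SKIPPED_NO_TABLE;
        }
        // A real implementation would go on to open the region here.
        return Outcome.OPENED;
    }
}
```

The point is only that the check happens before any region instantiation, so 
the user sees one clear warning rather than an NPE from the coprocessor host.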

@LarsG Should we make a new issue for that?  Seems like, again, we should eat up 
the zk data if there is no corresponding table in HDFS/.META. and proceed?
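
Something like the following is what I have in mind for the master side. It 
is a sketch under assumptions: StaleZkCleaner and pruneOrphans are 
hypothetical names, and a plain Map stands in for the region entries found 
in zk (encoded region name to table name), which is not how the real znodes 
are laid out:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: a plain Map stands in for the region entries left in zk.
final class StaleZkCleaner {
    // zkRegions maps encoded region name -> table name (hypothetical layout).
    // Removes entries whose table no longer exists and returns the removed region names.
    static List<String> pruneOrphans(Map<String, String> zkRegions, Set<String> existingTables) {
        List<String> removed = new ArrayList<>();
        zkRegions.entrySet().removeIf(entry -> {
            if (existingTables.contains(entry.getValue())) {
                return false; // table is still there, keep the entry
            }
            removed.add(entry.getKey());
            return true; // no such table in HDFS/.META., drop the leftover entry
        });
        return removed;
    }
}
```

The returned list is there so the caller can log what was dropped before 
proceeding with startup.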

@J-D Lars says "We typically say that the ZK is not important for operating 
HBase, but that is not strictly true. For example we need the ZK state for 
replication."

Can we fix that?  It'd be cool if we could keep the axiom that zk state is 
transient.  Or maybe, for the likes of data that needs to prevail across 
restarts and upgrades, it should be recorded elsewhere in zk, outside of the 
per-cluster location?


                
> Detect leftover data in ZK after a user delete all its HBase data
> -----------------------------------------------------------------
>
>                 Key: HBASE-6294
>                 URL: https://issues.apache.org/jira/browse/HBASE-6294
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.94.0
>            Reporter: Jean-Daniel Cryans
>            Priority: Critical
>             Fix For: 0.96.0, 0.94.2
>
>
> It seems we have a new failure mode when a user deletes the hbase.rootdir 
> directory but doesn't delete the ZK data. For example, a user on IRC came 
> with this log:
> {noformat}
> 2012-06-30 09:07:48,017 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open 
> region: kw,,1340981821308.2e8a318837602c9c9961e9d690b7fd02.
> 2012-06-30 09:07:48,017 WARN org.apache.hadoop.hbase.util.FSTableDescriptors: 
> The following folder is in HBase's root directory and doesn't contain a table 
> descriptor, do consider deleting it: kw
> 2012-06-30 09:07:48,018 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> regionserver:34193-0x1383bfe01b70001 Attempting to transition node 
> 2e8a318837602c9c9961e9d690b7fd02 from M_ZK_REGION_OFFLINE to 
> RS_ZK_REGION_OPENING
> 2012-06-30 09:07:48,018 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=M_ZK_REGION_OFFLINE, server=localhost,50890,1341036299694, 
> region=2e8a318837602c9c9961e9d690b7fd02
> 2012-06-30 09:07:48,020 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_FAILED_OPEN, server=localhost,34193,1341036300138, 
> region=b254af24c9127b8bb22cb6d24e523dad
> 2012-06-30 09:07:48,020 DEBUG 
> org.apache.hadoop.hbase.master.handler.ClosedRegionHandler: Handling CLOSED 
> event for b254af24c9127b8bb22cb6d24e523dad
> 2012-06-30 09:07:48,020 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
> was=kw_r,,1340981822374.b254af24c9127b8bb22cb6d24e523dad. state=CLOSED, 
> ts=1341036467998, server=localhost,34193,1341036300138
> 2012-06-30 09:07:48,020 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> master:50890-0x1383bfe01b70000 Creating (or updating) unassigned node for 
> b254af24c9127b8bb22cb6d24e523dad with OFFLINE state
> 2012-06-30 09:07:48,028 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> regionserver:34193-0x1383bfe01b70001 Successfully transitioned node 
> 2e8a318837602c9c9961e9d690b7fd02 from M_ZK_REGION_OFFLINE to 
> RS_ZK_REGION_OPENING
> 2012-06-30 09:07:48,028 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Opening region: {NAME => 
> 'kw,,1340981821308.2e8a318837602c9c9961e9d690b7fd02.', STARTKEY => '', ENDKEY 
> => '', ENCODED => 2e8a318837602c9c9961e9d690b7fd02,}
> 2012-06-30 09:07:48,029 ERROR 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open 
> of region=kw,,1340981821308.2e8a318837602c9c9961e9d690b7fd02., starting to 
> roll back the global memstore size.
> java.lang.IllegalStateException: Could not instantiate a region instance.
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:3490)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3628)
>       at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
>       at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
>       at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>       at java.lang.Thread.run(Thread.java:679)
> Caused by: java.lang.reflect.InvocationTargetException
>       at sun.reflect.GeneratedConstructorAccessor15.newInstance(Unknown 
> Source)
>       at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>       at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:3487)
>       ... 7 more
> Caused by: java.lang.NullPointerException
>       at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.loadTableCoprocessors(RegionCoprocessorHost.java:133)
>       at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.<init>(RegionCoprocessorHost.java:125)
>       at org.apache.hadoop.hbase.regionserver.HRegion.<init>(HRegion.java:411)
>       ... 11 more
> 2012-06-30 09:07:48,031 INFO 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opening of 
> region {NAME => 'kw,,1340981821308.2e8a318837602c9c9961e9d690b7fd02.', 
> STARTKEY => '', ENDKEY => '', ENCODED => 2e8a318837602c9c9961e9d690b7fd02,} 
> failed, marking as FAILED_OPEN in ZK
> 2012-06-30 09:07:48,032 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> regionserver:34193-0x1383bfe01b70001 Attempting to transition node 
> 2e8a318837602c9c9961e9d690b7fd02 from RS_ZK_REGION_OPENING to 
> RS_ZK_REGION_FAILED_OPEN
> 2012-06-30 09:07:48,031 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_OPENING, server=localhost,34193,1341036300138, 
> region=2e8a318837602c9c9961e9d690b7fd02
> 2012-06-30 09:07:48,043 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=M_ZK_REGION_OFFLINE, server=localhost,50890,1341036299694, 
> region=b254af24c9127b8bb22cb6d24e523dad
> {noformat}
> The exception itself is not very useful, nor is the NPE deep in the coproc 
> stack. What was really useful was this:
> {noformat}
> 2012-06-30 09:07:48,017 WARN org.apache.hadoop.hbase.util.FSTableDescriptors: 
> The following folder is in HBase's root directory and doesn't contain a table 
> descriptor, do consider deleting it: kw
> {noformat}
> So HBase wants to assign a region from a table that doesn't exist and we 
> fail in an obscure way. I told the user to shut down HBase, nuke 
> /tmp/hbase-user as it will contain both the HBase data and the ZK data, and 
> restart. It worked.
> This situation is new in 0.94; we need to detect it so our users have a 
> better experience getting started with HBase.


        
