[ 
https://issues.apache.org/jira/browse/HBASE-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-3381:
-------------------------

    Attachment: 3381.txt

Here is the patch.  I have not been able to reproduce the condition during the
last few hours of testing, so I would like to commit this (need a +1 -- Jon?).
While in here, I cleaned up some hbck messages and stopped hbck from reporting
an error for an offlined split parent.  I also added logging around the fixup
of the case where the parent offlining edit got in but the daughter additions
did not; it was needed for debugging.

> Interrupt of a region open comes across as a successful open
> ------------------------------------------------------------
>
>                 Key: HBASE-3381
>                 URL: https://issues.apache.org/jira/browse/HBASE-3381
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.90.0
>
>         Attachments: 3381.txt
>
>
> Meta was offline when below happened:
> {code}
> 2010-12-21 19:45:23,023 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> regionserver:60020-0x12d0a53c540000e Attempting to transition node 
> 337038b50e467fbd6b031f278bbd9c22 from RS_ZK_REGION_OPENING to 
> RS_ZK_REGION_OPENING
> 2010-12-21 19:45:23,046 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> regionserver:60020-0x12d0a53c540000e Successfully transitioned node 
> 337038b50e467fbd6b031f278bbd9c22 from RS_ZK_REGION_OPENING to 
> RS_ZK_REGION_OPENING
> 2010-12-21 19:45:26,379 DEBUG 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Interrupting 
> thread Thread[PostOpenDeployTasks:337038b50e467fbd6b031f278bbd9c22,5,main]
> 2010-12-21 19:45:26,379 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> regionserver:60020-0x12d0a53c540000e Attempting to transition node 
> 337038b50e467fbd6b031f278bbd9c22 from RS_ZK_REGION_OPENING to 
> RS_ZK_REGION_OPENED
> 2010-12-21 19:45:26,381 WARN 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Exception 
> running postOpenDeployTasks; region=337038b50e467fbd6b031f278bbd9c22
> org.apache.hadoop.hbase.NotAllMetaRegionsOnlineException: Interrupted
>     at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnectionDefault(CatalogTracker.java:364)
>     at 
> org.apache.hadoop.hbase.catalog.MetaEditor.updateRegionLocation(MetaEditor.java:146)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.postOpenDeployTasks(HRegionServer.java:1331)
>     at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler$PostOpenDeployTasksThread.run(OpenRegionHandler.java:195)
> ...
> {code}
> So, we timed out trying to open the region, but rather than closing the 
> region because the edit failed, we missed seeing the InterruptedException.
> Here is suggested fix:
> {code}
> diff --git a/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java 
> b/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java
> index 7bf680d..2b0078c 100644
> --- a/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java
> +++ b/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java
> @@ -339,7 +339,7 @@ public class MetaReader {
>      get.addFamily(HConstants.CATALOG_FAMILY);
>      byte [] meta = getCatalogRegionNameForRegion(regionName);
>      Result r = catalogTracker.waitForMetaServerConnectionDefault().get(meta, 
> get);
> -    if(r == null || r.isEmpty()) {
> +    if (r == null || r.isEmpty()) {
>        return null;
>      }
>      return metaRowToRegionPair(r);
> {code}
> Let me try it.
> Without this, hbck shows the region on server X while .META. still shows it 
> on Y (its pre-balance server).
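
The failure mode in the quoted report (an interrupted post-open step being reported as a successful open) can be illustrated with a small standalone Java sketch. This is not the code in 3381.txt and it does not use HBase classes: `RegionOpenSketch`, `PostOpenTask`, `openRegion`, and the latch standing in for "meta is online" are all hypothetical names chosen for illustration. The idea is only that the handler should look at how the worker thread ended (timed out, interrupted, or failed) before it declares the open a success.

```java
import java.util.concurrent.CountDownLatch;

/** Hypothetical stand-in for a region open handler; not HBase code. */
class RegionOpenSketch {

  /** Runs the post-open work on its own thread and records how it ended. */
  static class PostOpenTask extends Thread {
    private final CountDownLatch metaOnline;
    private volatile Throwable failure; // null means the work completed cleanly

    PostOpenTask(CountDownLatch metaOnline) {
      this.metaOnline = metaOnline;
    }

    @Override
    public void run() {
      try {
        // Assumption: waiting on this latch models waiting for meta to come online.
        metaOnline.await();
      } catch (InterruptedException e) {
        // Record the interrupt instead of swallowing it, then restore the flag.
        failure = e;
        Thread.currentThread().interrupt();
      }
    }

    Throwable getFailure() {
      return failure;
    }
  }

  /**
   * Returns true only if the post-open work finished within timeoutMs
   * and recorded no failure. A timeout or interrupt counts as a failed open.
   */
  static boolean openRegion(CountDownLatch metaOnline, long timeoutMs) {
    PostOpenTask task = new PostOpenTask(metaOnline);
    task.start();
    try {
      task.join(timeoutMs);
      boolean timedOut = task.isAlive();
      if (timedOut) {
        task.interrupt(); // give up on the post-open work
        task.join();      // wait for the worker to record its failure
      }
      return !timedOut && task.getFailure() == null;
    } catch (InterruptedException e) {
      // The calling thread itself was interrupted: also not a successful open.
      task.interrupt();
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

With a latch that never counts down (meta stays offline), `openRegion` returns false after the timeout; with a latch already at zero it returns true. In a real handler, the false branch is where the region would be closed rather than transitioned to opened.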
