[ 
https://issues.apache.org/jira/browse/HBASE-10148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847337#comment-13847337
 ] 

Anoop Sam John commented on HBASE-10148:
----------------------------------------

The issue is very clear now.
When distributed log replay is enabled, the postOpen() hook is called before
any replay happens. In VisibilityController we use this hook to read the
existing labels and initialize the cache, so the cache ends up with no items
at all. We also calculate the next ordinal number for adding a label in
postOpen(), and it comes out as 1 again. When the 2 new labels are written,
they get ordinals 1 and 2 and replace some of the old labels!

That is why the test fails: after adding 2 new labels (with 6 already in
place), the number of labels is still 6.
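
To make the ordering problem concrete, here is a minimal sketch in plain Java. It is a toy model, not HBase code: the class, the method, and the label names are all made up for illustration, and a TreeMap stands in for the labels region. The only thing it models is whether the "open" step (which computes the next ordinal) sees the unflushed edits or not.

```java
import java.util.TreeMap;

/** Toy model of the labels region across a restart; not HBase code. */
public class OrdinalReuseSketch {

    /**
     * Simulates a restart with six unflushed labels, then adds two new labels.
     * Returns the number of labels stored afterwards.
     */
    static int labelsAfterRestart(boolean openHookRunsAfterReplay) {
        // Six labels written before the restart but never flushed:
        // they exist only as WAL edits (ordinal -> label).
        TreeMap<Integer, String> unflushedEdits = new TreeMap<>();
        String[] oldLabels = {"A", "B", "C", "D", "E", "F"};
        for (int i = 0; i < oldLabels.length; i++) {
            unflushedEdits.put(i + 1, oldLabels[i]);
        }

        // What the region holds right after it is opened: nothing yet.
        TreeMap<Integer, String> region = new TreeMap<>();

        if (openHookRunsAfterReplay) {
            region.putAll(unflushedEdits); // replay finishes first
        }

        // The "open" step: derive the next ordinal from what is visible now.
        int nextOrdinal = region.isEmpty() ? 1 : region.lastKey() + 1;

        if (!openHookRunsAfterReplay) {
            region.putAll(unflushedEdits); // replay lands only after the hook ran
        }

        // Add two new labels using the ordinal computed above.
        region.put(nextOrdinal++, "NEW1");
        region.put(nextOrdinal++, "NEW2");
        return region.size();
    }

    public static void main(String[] args) {
        System.out.println(labelsAfterRestart(false)); // hook before replay: 6
        System.out.println(labelsAfterRestart(true));  // hook after replay: 8
    }
}
```

With the hook running before replay, the two new labels reuse ordinals 1 and 2 and overwrite old entries, so the count stays at 6, which matches what the test observes.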

AccessController has a similar issue, since it also caches the existing data
in postOpen(). If some unflushed ACLs are in place, they will not be added to
the cache (upon RS restart).

I am trying out possible solutions.

Can we think of a hook with replay like postReplay() or so?
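
Roughly, the idea could look like the sketch below. This is only an illustration in plain Java: postReplay() is the hook being proposed here and does not exist today, and RegionHooks, LabelCache and openRegion are hypothetical names, not the real coprocessor API. A list of rows stands in for the region contents.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of a replay-aware hook; all names here are hypothetical. */
public class PostReplayHookSketch {

    /** Hypothetical observer interface with the proposed extra hook. */
    interface RegionHooks {
        void postOpen(List<String> visibleRows);
        void postReplay(List<String> visibleRows);
    }

    /** Cache that reloads itself whenever the visible rows may have changed. */
    static class LabelCache implements RegionHooks {
        final List<String> labels = new ArrayList<>();
        int nextOrdinal = 1;

        @Override
        public void postOpen(List<String> visibleRows) {
            reload(visibleRows);
        }

        @Override
        public void postReplay(List<String> visibleRows) {
            // Replayed edits are visible now, so rebuild the cache and ordinal.
            reload(visibleRows);
        }

        private void reload(List<String> visibleRows) {
            labels.clear();
            labels.addAll(visibleRows);
            nextOrdinal = visibleRows.size() + 1;
        }
    }

    /** Toy region open: the open hook fires first, the replay hook after replay. */
    static LabelCache openRegion(List<String> flushedRows, List<String> walEdits) {
        LabelCache cache = new LabelCache();
        List<String> rows = new ArrayList<>(flushedRows);
        cache.postOpen(rows);    // sees only flushed data
        rows.addAll(walEdits);   // replay makes unflushed edits visible
        cache.postReplay(rows);  // cache catches up
        return cache;
    }
}
```

In this model, opening a region with no flushed rows and three replayed labels leaves the cache with three labels and a next ordinal of 4, instead of starting again from 1.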

> [VisibilityController] Tolerate regions in recovery
> ---------------------------------------------------
>
>                 Key: HBASE-10148
>                 URL: https://issues.apache.org/jira/browse/HBASE-10148
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: Andrew Purtell
>            Assignee: Anoop Sam John
>             Fix For: 0.98.0
>
>
> Ted Yu reports that enabling distributed log replay by default, like:
> {noformat}
> Index: hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
> ===================================================================
> --- hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java        
> (revision 1550575)
> +++ hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java        
> (working copy)
> @@ -794,7 +794,7 @@
>    /** Conf key that enables unflushed WAL edits directly being replayed to 
> region servers */
>    public static final String DISTRIBUTED_LOG_REPLAY_KEY = 
> "hbase.master.distributed.log.replay";
> -  public static final boolean DEFAULT_DISTRIBUTED_LOG_REPLAY_CONFIG = false;
> +  public static final boolean DEFAULT_DISTRIBUTED_LOG_REPLAY_CONFIG = true;
>    public static final String DISALLOW_WRITES_IN_RECOVERING =
>        "hbase.regionserver.disallow.writes.when.recovering";
>    public static final boolean DEFAULT_DISALLOW_WRITES_IN_RECOVERING_CONFIG = 
> false;
> {noformat}
> causes TestVisibilityController#testAddVisibilityLabelsOnRSRestart to fail. 
> It reveals an issue with label operations if the label table is recovering:
> {noformat}
> 2013-12-12 14:53:53,133 DEBUG [RpcServer.handler=2,port=58108] 
> visibility.VisibilityController(1046): Adding the label XYZ
> 2013-12-12 14:53:53,137 ERROR [RpcServer.handler=2,port=58108] 
> visibility.VisibilityController(1074): 
> org.apache.hadoop.hbase.exceptions.RegionInRecoveryException: 
> hbase:labels,,1386888826648.f14a399ba85cbb42c2c3b7547bf17c65. is recovering
> 2013-12-12 14:53:53,151 DEBUG [main] visibility.TestVisibilityLabels(405): 
> response from addLabels: result {
>   exception {
>     name: "org.apache.hadoop.hbase.exceptions.RegionInRecoveryException"
>     value: "org.apache.hadoop.hbase.exceptions.RegionInRecoveryException: 
> hbase:labels,,1386888826648.f14a399ba85cbb42c2c3b7547bf17c65. is recovering 
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:5555)
>  at 
> org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1763) at 
> org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1749) at 
> org.apache.hadoop.hbase.security.visibility.VisibilityController.getExistingLabelsWithAuths(VisibilityController.java:1096)
>  at 
> org.apache.hadoop.hbase.security.visibility.VisibilityController.postBatchMutate(VisibilityController.java:672)"
> {noformat}
> Should we try to ride over this?



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
