[ 
https://issues.apache.org/jira/browse/HBASE-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336098#comment-14336098
 ] 

Jerry He commented on HBASE-13084:
----------------------------------

FYI. While working on HBASE-12644, I recall seeing ZK events (data changed 
events) lost (not captured) by the ZK watcher due to race condition.
See this comment, 
https://issues.apache.org/jira/browse/HBASE-12644?focusedCommentId=14242242&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14242242
It may not be the same issue.
In the test Surefire reports, there should be detailed ZK events debug logging. 
That will help to make sure at least no visibility ZK events are lost due to 
race condition.
One idea is put the wait in Before the addLabels call to make sure all prior ZK 
events have been processed and watchs are reset.

On the other hand, in waitLabelAvailable() in the patch, try to use the 
listLabels() from the client to see if the labels are really in there.
 
Just threw in some ideas.

> Add labels to VisibilityLabelsCache asynchronously causes TestShell flakey
> --------------------------------------------------------------------------
>
>                 Key: HBASE-13084
>                 URL: https://issues.apache.org/jira/browse/HBASE-13084
>             Project: HBase
>          Issue Type: Bug
>          Components: test
>            Reporter: zhangduo
>            Assignee: zhangduo
>             Fix For: 2.0.0, 1.1.0
>
>         Attachments: HBASE-13084.patch, HBASE-13084_1.patch, 
> HBASE-13084_2.patch, HBASE-13084_2.patch, HBASE-13084_2.patch, 
> HBASE-13084_2.patch
>
>
> As discussed in HBASE-12953, we found this error in PreCommit log
> https://builds.apache.org/job/PreCommit-HBASE-Build/12918/artifact/hbase-shell/target/surefire-reports/org.apache.hadoop.hbase.client.TestShell-output.txt
> {noformat}
>   1) Error:
> test_The_get/put_methods_should_work_for_data_written_with_Visibility(Hbase::VisibilityLabelsAdminMethodsTest):
> ArgumentError: org.apache.hadoop.hbase.DoNotRetryIOException: 
> org.apache.hadoop.hbase.security.visibility.InvalidLabelException: Label 
> 'TEST_VISIBILITY' doesn't exists
>       at 
> org.apache.hadoop.hbase.security.visibility.VisibilityController.setAuths(VisibilityController.java:808)
>       at 
> org.apache.hadoop.hbase.protobuf.generated.VisibilityLabelsProtos$VisibilityLabelsService$1.setAuths(VisibilityLabelsProtos.java:6036)
>       at 
> org.apache.hadoop.hbase.protobuf.generated.VisibilityLabelsProtos$VisibilityLabelsService.callMethod(VisibilityLabelsProtos.java:6219)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:6867)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1707)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1689)
>       at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31309)
>       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2038)
>       at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
>       at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
>       at java.lang.Thread.run(Thread.java:744)
>     
> /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-shell/src/main/ruby/hbase/visibility_labels.rb:84:in
>  `set_auths'
>     ./src/test/ruby/hbase/visibility_labels_admin_test.rb:77:in 
> `test_The_get/put_methods_should_work_for_data_written_with_Visibility'
>     org/jruby/RubyProc.java:270:in `call'
>     org/jruby/RubyKernel.java:2105:in `send'
>     org/jruby/RubyArray.java:1620:in `each'
>     org/jruby/RubyArray.java:1620:in `each'
>   2) Error:
> test_The_set/clear_methods_should_work_with_authorizations(Hbase::VisibilityLabelsAdminMethodsTest):
> ArgumentError: No authentication set for the given user jenkins
>     
> /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-shell/src/main/ruby/hbase/visibility_labels.rb:97:in
>  `get_auths'
>     ./src/test/ruby/hbase/visibility_labels_admin_test.rb:57:in 
> `test_The_set/clear_methods_should_work_with_authorizations'
>     org/jruby/RubyProc.java:270:in `call'
>     org/jruby/RubyKernel.java:2105:in `send'
>     org/jruby/RubyArray.java:1620:in `each'
>     org/jruby/RubyArray.java:1620:in `each'
> {noformat}
> This is the test code
> {code:title=visibility_labels_admin_test.rb}
>       label = 'TEST_VISIBILITY'
>       user = org.apache.hadoop.hbase.security.User.getCurrent().getName();
>       visibility_admin.add_labels(label)
>       visibility_admin.set_auths(user, label)
> {code}
> It says 'label does not exists' when calling set_auths.
> Then I add some ugly logs in  DefaultVisibilityLabelServiceImpl and 
> VisibilityLabelsCache.
> {code:title=DefaultVisibilityLabelServiceImpl.java}
>   public OperationStatus[] addLabels(List<byte[]> labels) throws IOException {
>     ...
>     if (mutateLabelsRegion(puts, finalOpStatus)) {
>       updateZk(true);
>     }
>     for (byte[] label : labels) {
>       String labelStr = Bytes.toString(label);
>       LOG.info(labelStr + "=====" + 
> this.labelsCache.getLabelOrdinal(labelStr));
>     }
>     ...
>   }
> {code}
> {code:title=VisibilityLabelsCache.java}
>   public void refreshLabelsCache(byte[] data) throws IOException {
>     LOG.info("========refresh", new Exception());
>     ...
>   }
> {code}
> And I modified TestVisibilityLabelsWithCustomVisLabService to use 
> DefaultVisibilityLabelServiceImpl, then collected the logs of setupBeforeClass
> {noformat}
> 2015-02-21 20:39:16,362 INFO  
> [B.defaultRpcServer.handler=0,queue=0,port=42678] 
> visibility.DefaultVisibilityLabelServiceImpl(236): secret=====0
> 2015-02-21 20:39:16,362 INFO  
> [B.defaultRpcServer.handler=0,queue=0,port=42678] 
> visibility.DefaultVisibilityLabelServiceImpl(236): topsecret=====0
> 2015-02-21 20:39:16,362 INFO  
> [B.defaultRpcServer.handler=0,queue=0,port=42678] 
> visibility.DefaultVisibilityLabelServiceImpl(236): confidential=====0
> 2015-02-21 20:39:16,362 INFO  
> [B.defaultRpcServer.handler=0,queue=0,port=42678] 
> visibility.DefaultVisibilityLabelServiceImpl(236): public=====0
> 2015-02-21 20:39:16,362 INFO  
> [B.defaultRpcServer.handler=0,queue=0,port=42678] 
> visibility.DefaultVisibilityLabelServiceImpl(236): private=====0
> 2015-02-21 20:39:16,362 INFO  
> [B.defaultRpcServer.handler=0,queue=0,port=42678] 
> visibility.DefaultVisibilityLabelServiceImpl(236): ©ABC=====0
> 2015-02-21 20:39:16,362 INFO  
> [B.defaultRpcServer.handler=0,queue=0,port=42678] 
> visibility.DefaultVisibilityLabelServiceImpl(236): ु=====0
> 2015-02-21 20:39:16,362 INFO  
> [B.defaultRpcServer.handler=0,queue=0,port=42678] 
> visibility.DefaultVisibilityLabelServiceImpl(236): ©ABC"ु\secret"'&\=====0
> 2015-02-21 20:39:16,363 INFO  
> [B.defaultRpcServer.handler=0,queue=0,port=42678] 
> visibility.DefaultVisibilityLabelServiceImpl(236): '"+=====0
> 2015-02-21 20:39:16,363 INFO  
> [B.defaultRpcServer.handler=0,queue=0,port=42678] 
> visibility.DefaultVisibilityLabelServiceImpl(236): -?=====0
> 2015-02-21 20:39:16,363 INFO  [main-EventThread] 
> visibility.VisibilityLabelsCache(115): ========refresh
> java.lang.Exception
>       at 
> org.apache.hadoop.hbase.security.visibility.VisibilityLabelsCache.refreshLabelsCache(VisibilityLabelsCache.java:115)
>       at 
> org.apache.hadoop.hbase.security.visibility.ZKVisibilityLabelWatcher.refreshVisibilityLabelsCache(ZKVisibilityLabelWatcher.java:76)
>       at 
> org.apache.hadoop.hbase.security.visibility.ZKVisibilityLabelWatcher.nodeDataChanged(ZKVisibilityLabelWatcher.java:115)
>       at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:412)
>       at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
>       at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> {noformat}
> We can see that, refresh is triggered from a background zk watcher, and it is 
> possible that the labels are not in labelsCache after addLabels being called.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to