zhangduo created HBASE-13084: -------------------------------- Summary: Add labels to VisibilityLabelsCache asynchronously causes TestShell flakey Key: HBASE-13084 URL: https://issues.apache.org/jira/browse/HBASE-13084 Project: HBase Issue Type: Bug Components: test Reporter: zhangduo
As discussed in HBASE-12953, we found this error in PreCommit log https://builds.apache.org/job/PreCommit-HBASE-Build/12918/artifact/hbase-shell/target/surefire-reports/org.apache.hadoop.hbase.client.TestShell-output.txt {noformat} 1) Error: test_The_get/put_methods_should_work_for_data_written_with_Visibility(Hbase::VisibilityLabelsAdminMethodsTest): ArgumentError: org.apache.hadoop.hbase.DoNotRetryIOException: org.apache.hadoop.hbase.security.visibility.InvalidLabelException: Label 'TEST_VISIBILITY' doesn't exists at org.apache.hadoop.hbase.security.visibility.VisibilityController.setAuths(VisibilityController.java:808) at org.apache.hadoop.hbase.protobuf.generated.VisibilityLabelsProtos$VisibilityLabelsService$1.setAuths(VisibilityLabelsProtos.java:6036) at org.apache.hadoop.hbase.protobuf.generated.VisibilityLabelsProtos$VisibilityLabelsService.callMethod(VisibilityLabelsProtos.java:6219) at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:6867) at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1707) at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1689) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31309) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2038) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107) at java.lang.Thread.run(Thread.java:744) /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-shell/src/main/ruby/hbase/visibility_labels.rb:84:in `set_auths' ./src/test/ruby/hbase/visibility_labels_admin_test.rb:77:in `test_The_get/put_methods_should_work_for_data_written_with_Visibility' org/jruby/RubyProc.java:270:in `call' org/jruby/RubyKernel.java:2105:in `send' org/jruby/RubyArray.java:1620:in `each' org/jruby/RubyArray.java:1620:in `each' 2) Error: test_The_set/clear_methods_should_work_with_authorizations(Hbase::VisibilityLabelsAdminMethodsTest): ArgumentError: No authentication set for the given user jenkins /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-shell/src/main/ruby/hbase/visibility_labels.rb:97:in `get_auths' ./src/test/ruby/hbase/visibility_labels_admin_test.rb:57:in `test_The_set/clear_methods_should_work_with_authorizations' org/jruby/RubyProc.java:270:in `call' org/jruby/RubyKernel.java:2105:in `send' org/jruby/RubyArray.java:1620:in `each' org/jruby/RubyArray.java:1620:in `each' {noformat} This is the test code {code:title=visibility_labels_admin_test.rb} label = 'TEST_VISIBILITY' user = org.apache.hadoop.hbase.security.User.getCurrent().getName(); visibility_admin.add_labels(label) visibility_admin.set_auths(user, label) {code} It says 'label does not exists' when calling set_auths. Then I add some ugly logs in DefaultVisibilityLabelServiceImpl and VisibilityLabelsCache. {code:title=DefaultVisibilityLabelServiceImpl.java} public OperationStatus[] addLabels(List<byte[]> labels) throws IOException { ... if (mutateLabelsRegion(puts, finalOpStatus)) { updateZk(true); } for (byte[] label : labels) { String labelStr = Bytes.toString(label); LOG.info(labelStr + "=====" + this.labelsCache.getLabelOrdinal(labelStr)); } ... } {code} {code:title=VisibilityLabelsCache.java} public void refreshLabelsCache(byte[] data) throws IOException { LOG.info("========refresh", new Exception()); ... } {code} And I modified TestVisibilityLabelsWithCustomVisLabService to use DefaultVisibilityLabelServiceImpl, then collected the logs of setupBeforeClass {noformat} 2015-02-21 20:39:16,362 INFO [B.defaultRpcServer.handler=0,queue=0,port=42678] visibility.DefaultVisibilityLabelServiceImpl(236): secret=====0 2015-02-21 20:39:16,362 INFO [B.defaultRpcServer.handler=0,queue=0,port=42678] visibility.DefaultVisibilityLabelServiceImpl(236): topsecret=====0 2015-02-21 20:39:16,362 INFO [B.defaultRpcServer.handler=0,queue=0,port=42678] visibility.DefaultVisibilityLabelServiceImpl(236): confidential=====0 2015-02-21 20:39:16,362 INFO [B.defaultRpcServer.handler=0,queue=0,port=42678] visibility.DefaultVisibilityLabelServiceImpl(236): public=====0 2015-02-21 20:39:16,362 INFO [B.defaultRpcServer.handler=0,queue=0,port=42678] visibility.DefaultVisibilityLabelServiceImpl(236): private=====0 2015-02-21 20:39:16,362 INFO [B.defaultRpcServer.handler=0,queue=0,port=42678] visibility.DefaultVisibilityLabelServiceImpl(236): ©ABC=====0 2015-02-21 20:39:16,362 INFO [B.defaultRpcServer.handler=0,queue=0,port=42678] visibility.DefaultVisibilityLabelServiceImpl(236): ु=====0 2015-02-21 20:39:16,362 INFO [B.defaultRpcServer.handler=0,queue=0,port=42678] visibility.DefaultVisibilityLabelServiceImpl(236): ©ABC"ु\secret"'&\=====0 2015-02-21 20:39:16,363 INFO [B.defaultRpcServer.handler=0,queue=0,port=42678] visibility.DefaultVisibilityLabelServiceImpl(236): '"+=====0 2015-02-21 20:39:16,363 INFO [B.defaultRpcServer.handler=0,queue=0,port=42678] visibility.DefaultVisibilityLabelServiceImpl(236): -?=====0 2015-02-21 20:39:16,363 INFO [main-EventThread] visibility.VisibilityLabelsCache(115): ========refresh java.lang.Exception at org.apache.hadoop.hbase.security.visibility.VisibilityLabelsCache.refreshLabelsCache(VisibilityLabelsCache.java:115) at org.apache.hadoop.hbase.security.visibility.ZKVisibilityLabelWatcher.refreshVisibilityLabelsCache(ZKVisibilityLabelWatcher.java:76) at org.apache.hadoop.hbase.security.visibility.ZKVisibilityLabelWatcher.nodeDataChanged(ZKVisibilityLabelWatcher.java:115) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:412) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498) {noformat} We can see that, refresh is triggered from a background zk watcher, and it is possible that the labels are not in labelsCache after addLabels being called. -- This message was sent by Atlassian JIRA (v6.3.4#6332)