[ https://issues.apache.org/jira/browse/HBASE-18244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16057897#comment-16057897 ]
Hadoop QA commented on HBASE-18244: ----------------------------------- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 35s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 9s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 0s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 8s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 6s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 28m 26s {color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha3. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 0s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 8s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 54s {color} | {color:green} hbase-shell in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 6s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 41m 4s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:757bf37 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12873901/HBASE-18244.001.patch | | JIRA Issue | HBASE-18244 | | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux 058c6c72262e 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 83be50c | | Default Java | 1.8.0_131 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/7280/testReport/ | | modules | C: hbase-shell U: hbase-shell | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7280/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > org.apache.hadoop.hbase.client.rsgroup.TestShellRSGroups hangs/fails > -------------------------------------------------------------------- > > Key: HBASE-18244 > URL: https://issues.apache.org/jira/browse/HBASE-18244 > Project: HBase > Issue Type: Bug > Components: test > Reporter: Josh Elser > Assignee: Stephen Yuan Jiang > Fix For: 2.0.0 > > Attachments: HBASE-18244.001.patch > > > Sometime in the past couple of weeks, TestShellRSGroups has started > timing-out/failing for me. > It will get stuck on a call to moveTables() > {noformat} > "main" #1 prio=5 os_prio=31 tid=0x00007ff012004800 nid=0x1703 in > Object.wait() [0x000070000020d000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > at java.lang.Object.wait(Object.java:502) > at > org.apache.hadoop.hbase.ipc.BlockingRpcCallback.get(BlockingRpcCallback.java:62) > - locked <0x000000078d1003f0> (a > org.apache.hadoop.hbase.ipc.BlockingRpcCallback) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:328) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$200(AbstractRpcClient.java:94) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:567) > at > org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$BlockingStub.execMasterService(MasterProtos.java) > at > org.apache.hadoop.hbase.client.ConnectionImplementation$3.execMasterService(ConnectionImplementation.java:1500) > at > org.apache.hadoop.hbase.client.HBaseAdmin$67$1.rpcCall(HBaseAdmin.java:2991) > at > org.apache.hadoop.hbase.client.HBaseAdmin$67$1.rpcCall(HBaseAdmin.java:2986) > at > org.apache.hadoop.hbase.client.MasterCallable.call(MasterCallable.java:98) > at > org.apache.hadoop.hbase.client.HBaseAdmin$67.callExecService(HBaseAdmin.java:2997) > at > org.apache.hadoop.hbase.client.SyncCoprocessorRpcChannel.callBlockingMethod(SyncCoprocessorRpcChannel.java:69) > at > org.apache.hadoop.hbase.protobuf.generated.RSGroupAdminProtos$RSGroupAdminService$BlockingStub.moveTables(RSGroupAdminProtos.java:13171) > at > org.apache.hadoop.hbase.rsgroup.RSGroupAdminClient.moveTables(RSGroupAdminClient.java:117) > {noformat} > The server-side end of the RPC is waiting on a procedure to finish: > {noformat} > "RpcServer.default.FPBQ.Fifo.handler=27,queue=0,port=64242" #289 daemon > prio=5 os_prio=31 tid=0x00007ff015b7c000 nid=0x1e603 waiting on condition > [0x000070000dbc9000] > java.lang.Thread.State: TIMED_WAITING (sleeping) > at java.lang.Thread.sleep(Native Method) > at > org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitFor(ProcedureSyncWait.java:184) > at > org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitFor(ProcedureSyncWait.java:171) > at > org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitForProcedureToComplete(ProcedureSyncWait.java:141) > at > org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitForProcedureToCompleteIOE(ProcedureSyncWait.java:130) > at > org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.submitAndWaitProcedure(ProcedureSyncWait.java:123) > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.unassign(AssignmentManager.java:478) > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.unassign(AssignmentManager.java:465) > at > org.apache.hadoop.hbase.rsgroup.RSGroupAdminServer.moveTables(RSGroupAdminServer.java:432) > at > org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint$RSGroupAdminServiceImpl.moveTables(RSGroupAdminEndpoint.java:174) > at > org.apache.hadoop.hbase.protobuf.generated.RSGroupAdminProtos$RSGroupAdminService.callMethod(RSGroupAdminProtos.java:12786) > at > org.apache.hadoop.hbase.master.MasterRpcServices.execMasterService(MasterRpcServices.java:673) > at > org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:278) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:258) > Locked ownable synchronizers: > - None > {noformat} > I don't see anything else running in the thread dump, but I do see that meta > was closed. Eventually the client (I think) gives up and retries, but then > fails because meta isn't assigned. > {noformat} > 2017-06-20 15:07:45,720 DEBUG [RS_CLOSE_META-hw10447:64242-0] > assignment.RegionTransitionProcedure(200): Received report CLOSED seqId=-1, > pid=12, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure > table=hbase:meta, region=1588230740, > server=hw10447.local,64242,1497985650605; rit=CLOSING, > location=hw10447.local,64242,1497985650605 > 2017-06-20 15:07:45,720 DEBUG [RS_CLOSE_META-hw10447:64242-0] > handler.CloseRegionHandler(122): Closed hbase:meta,,1.1588230740 > 2017-06-20 15:07:45,722 WARN [ProcExecWrkr-9] > zookeeper.MetaTableLocator(443): Tried to set null ServerName in hbase:meta; > skipping -- ServerName required > 2017-06-20 15:07:45,724 INFO [ProcExecWrkr-9] > procedure2.ProcedureExecutor(1167): Finished pid=12, state=SUCCESS; > UnassignProcedure table=hbase:meta, region=1588230740, > server=hw10447.local,64242,1497985650605 in 379msec > 2017-06-20 15:07:46,454 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=27,queue=0,port=64242] > procedure.ProcedureSyncWait(192): waitFor pid=12 > 2017-06-20 15:07:47,458 INFO > [RpcServer.default.FPBQ.Fifo.handler=27,queue=0,port=64242] > rsgroup.RSGroupAdminServer(345): Unassigning 0 region(s) from > hw10447.local:64242 for server move to test_group > 2017-06-20 15:07:48,462 INFO > [RpcServer.default.FPBQ.Fifo.handler=27,queue=0,port=64242] > rsgroup.RSGroupAdminServer(378): Move server done: default=>test_group > Took 4.2749 seconds2017-06-20 15:07:48,488 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=27,queue=0,port=64242] > rsgroup.RSGroupInfoManagerImpl(480): Updating znode: /hbase/rsgroup/test_group > 2017-06-20 15:07:48,488 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=27,queue=0,port=64242] > rsgroup.RSGroupInfoManagerImpl(480): Updating znode: /hbase/rsgroup/default > 2017-06-20 15:07:48,488 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=27,queue=0,port=64242] > rsgroup.RSGroupInfoManagerImpl(486): Writing ZK GroupInfo count: 4 > 2017-06-20 15:07:48,598 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=27,queue=0,port=64242] > procedure2.ProcedureExecutor(792): Stored pid=13, state=RUNNABLE; > org.apache.hadoop.hbase.master.locking.LockProcedure, tableName=test_table, > type=EXCLUSIVE > 2017-06-20 15:07:48,600 DEBUG [ProcExecWrkr-10] locking.LockProcedure(313): > LOCKED pid=13, state=RUNNABLE; > org.apache.hadoop.hbase.master.locking.LockProcedure, tableName=test_table, > type=EXCLUSIVE > 2017-06-20 15:07:48,600 INFO [ProcExecWrkr-10] > procedure2.ProcedureExecutor$TimeoutExecutorThread(1743): ADDED pid=13, > state=WAITING_TIMEOUT; org.apache.hadoop.hbase.master.locking.LockProcedure, > tableName=test_table, type=EXCLUSIVE; timeout=600000, timestamp=1497986268600 > 2017-06-20 15:07:48,706 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=27,queue=0,port=64242] > procedure2.ProcedureExecutor(792): Stored pid=14, > state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure > table=test_table, region=578de877b09db11dcc129a6ff5b22406, > server=hw10447.local,64263,1497985652273 > 2017-06-20 15:07:49,179 WARN [HBase-Metrics2-1] impl.MetricsConfig(125): > Cannot locate configuration: tried > hadoop-metrics2-hbase.properties,hadoop-metrics2.properties > 2017-06-20 15:07:49,708 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=27,queue=0,port=64242] > procedure.ProcedureSyncWait(192): waitFor pid=14 > 2017-06-20 15:08:02,376 DEBUG [ProcExecTimeout] > procedure2.ProcedureExecutor$CompletedProcedureCleaner(190): Evict completed > AssignProcedure table=hbase:meta, region=1588230740 pid=1, > owner=jelser.hfs.3, state=SUCCESS, submittedTime=29sec ago, lastUpdate=29sec > ago > 2017-06-20 15:08:02,376 DEBUG [ProcExecTimeout] > procedure2.ProcedureExecutor$CompletedProcedureCleaner(190): Evict completed > CreateTableProcedure table=hbase:namespace pid=2, owner=jelser.hfs.3, > state=SUCCESS, submittedTime=27sec ago, lastUpdate=27sec ago > 2017-06-20 15:08:02,376 DEBUG [ProcExecTimeout] > procedure2.ProcedureExecutor$CompletedProcedureCleaner(190): Evict completed > CreateNamespaceProcedure, namespace=default pid=4, owner=jelser.hfs.3, > state=SUCCESS, submittedTime=27sec ago, lastUpdate=27sec ago > 2017-06-20 15:08:02,376 DEBUG [ProcExecTimeout] > procedure2.ProcedureExecutor$CompletedProcedureCleaner(190): Evict completed > CreateTableProcedure table=hbase:rsgroup pid=5, owner=jelser.hfs.3, > state=SUCCESS, submittedTime=27sec ago, lastUpdate=27sec ago > 2017-06-20 15:08:02,376 DEBUG [ProcExecTimeout] > procedure2.ProcedureExecutor$CompletedProcedureCleaner(190): Evict completed > CreateNamespaceProcedure, namespace=hbase pid=7, owner=jelser.hfs.3, > state=SUCCESS, submittedTime=26sec ago, lastUpdate=26sec ago > 2017-06-20 15:08:02,376 DEBUG [ProcExecTimeout] > procedure2.ProcedureExecutor$CompletedProcedureCleaner(190): Evict completed > UnassignProcedure table=hbase:namespace, > region=a47624f5345a80b72f278f79a432955b, > server=hw10447.local,64242,1497985650605 pid=11, owner=jelser, state=SUCCESS, > submittedTime=18sec ago, lastUpdate=18sec ago > 2017-06-20 15:08:02,376 DEBUG [ProcExecTimeout] > procedure2.ProcedureExecutor$CompletedProcedureCleaner(190): Evict completed > UnassignProcedure table=hbase:meta, region=1588230740, > server=hw10447.local,64242,1497985650605 pid=12, owner=jelser, state=SUCCESS, > submittedTime=17sec ago, lastUpdate=17sec ago > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)