[ https://issues.apache.org/jira/browse/HBASE-13480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707821#comment-14707821 ]
stack commented on HBASE-13480:
-------------------------------
All of my priority RPC handlers are stuck in some variant of this stack:
{code}
"PriorityRpcServer.handler=0,queue=0,port=57983" daemon prio=5 tid=0x00007fb77bd09800 nid=0xf40b in Object.wait() [0x0000000125863000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000007fdd9c700> (a org.apache.hadoop.hbase.ipc.AsyncCall)
    at java.lang.Object.wait(Object.java:461)
    at io.netty.util.concurrent.DefaultPromise.await0(DefaultPromise.java:355)
    - locked <0x00000007fdd9c700> (a org.apache.hadoop.hbase.ipc.AsyncCall)
    at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:266)
    at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:42)
    at org.apache.hadoop.hbase.ipc.AsyncRpcClient.call(AsyncRpcClient.java:232)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:214)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:288)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.get(ClientProtos.java:32876)
    at org.apache.hadoop.hbase.client.HTable$1.call(HTable.java:442)
    at org.apache.hadoop.hbase.client.HTable$1.call(HTable.java:433)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:118)
    at org.apache.hadoop.hbase.client.HTable.get(HTable.java:450)
    at org.apache.hadoop.hbase.client.HTable.get(HTable.java:416)
    at org.apache.hadoop.hbase.MetaTableAccessor.getTableState(MetaTableAccessor.java:1075)
    at org.apache.hadoop.hbase.master.TableStateManager.readMetaState(TableStateManager.java:187)
    at org.apache.hadoop.hbase.master.TableStateManager.getTableState(TableStateManager.java:171)
    at org.apache.hadoop.hbase.master.TableStateManager.isTableState(TableStateManager.java:130)
    at org.apache.hadoop.hbase.master.AssignmentManager.onRegionOpen(AssignmentManager.java:2255)
    at org.apache.hadoop.hbase.master.AssignmentManager.onRegionTransition(AssignmentManager.java:2826)
    at org.apache.hadoop.hbase.master.MasterRpcServices.reportRegionStateTransition(MasterRpcServices.java:1358)
    at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:8623)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2133)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:106)
    at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$2.run(RpcExecutor.java:107)
    at java.lang.Thread.run(Thread.java:744)
{code}
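
The trace shows a handler thread serving reportRegionStateTransition that is blocked inside a nested Get issued through AsyncRpcClient. The general pattern, a pooled worker waiting on a second call that needs a worker from the same pool, can be sketched with a plain ExecutorService. This is not HBase code: NestedRpcDemo and nestedCallFinishes are hypothetical names, and the fixed-size pool is only an assumed stand-in for the priority handler pool.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class NestedRpcDemo {

    /**
     * Runs one "outer" call on a pool with the given number of handler
     * threads. The outer call submits an "inner" call to the same pool and
     * blocks until it finishes. Returns true if the outer call completes
     * within timeoutMs, false if it is still blocked when the timeout expires.
     */
    static boolean nestedCallFinishes(int handlers, long timeoutMs) {
        ExecutorService pool = Executors.newFixedThreadPool(handlers);
        try {
            Future<String> outer = pool.submit(() -> {
                // Stand-in for the nested Get: it needs a free handler
                // from the very pool the caller is already occupying.
                Future<String> inner = pool.submit(() -> "table state");
                return inner.get();
            });
            try {
                outer.get(timeoutMs, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                return false; // outer handler is stuck waiting on inner
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        } finally {
            pool.shutdownNow(); // interrupt any handler that is still blocked
        }
    }

    public static void main(String[] args) {
        System.out.println("1 handler:  " + nestedCallFinishes(1, 500));
        System.out.println("2 handlers: " + nestedCallFinishes(2, 500));
    }
}
```

With a single handler the inner call can never be scheduled, so the outer call only ends when the timeout fires; with a spare handler it completes normally. In the sketch the timeout is what makes the stall observable.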
> ShortCircuitConnection doesn't short-circuit all calls as expected
> ------------------------------------------------------------------
>
> Key: HBASE-13480
> URL: https://issues.apache.org/jira/browse/HBASE-13480
> Project: HBase
> Issue Type: Bug
> Components: Client
> Affects Versions: 1.0.0, 2.0.0, 1.1.0
> Reporter: Josh Elser
> Assignee: Jingcheng Du
> Priority: Critical
> Fix For: 2.0.0, 1.3.0, 1.2.1, 1.0.3, 1.1.3
>
> Attachments: HBASE-13480-1.patch, HBASE-13480.patch
>
>
> Noticed the following situation while debugging unexpected unit test failures
> in HBASE-13351.
> {{ConnectionUtils#createShortCircuitHConnection(Connection, ServerName,
> AdminService.BlockingInterface, ClientService.BlockingInterface)}} is
> intended to avoid the extra RPC by calling the server's instantiation of the
> protobuf rpc stub directly for the AdminService and ClientService.
> The problem is that this is insufficient to actually avoid extra "remote"
> RPCs, because all other calls to the Connection are routed to a "real"
> Connection instance. As such, any object created by the "real" Connection
> (such as an HTable) will use the real Connection, not the short-circuit
> connection (SSC).
> The end result is that
> {{MasterRpcServices#reportRegionStateTransition(RpcController,
> ReportRegionStateTransitionRequest)}} makes additional "remote" RPCs over
> what it thinks is an SSC: it issues a {{Get}} on an {{HTable}} that was
> constructed using the SSC, but the {{Get}} itself uses the underlying
> real Connection instead of the SSC. With insufficiently sized thread pools,
> this has been observed to result in RPC deadlock in the HMaster: an RPC
> attempts to make another RPC, but no threads are available to service the
> second RPC, so the first RPC blocks indefinitely.
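
The delegation problem described above can be illustrated with a small sketch. None of these types are HBase classes: Conn, Table, RemoteConn, LeakyShortCircuit and SelfBoundShortCircuit are hypothetical names, and the "local:" / "remote:" strings merely mark which path served the call. SelfBoundShortCircuit shows one possible shape for a wrapper that does not leak; it is not a description of the attached patches.

```java
public class ShortCircuitLeakDemo {

    /** Hypothetical, heavily reduced connection interface. */
    interface Conn {
        String rpcGet(String row); // stand-in for the ClientService stub
        Table getTable();          // stand-in for creating an HTable
    }

    /** A table remembers the connection that created it. */
    static final class Table {
        private final Conn conn;
        Table(Conn conn) { this.conn = conn; }
        String get(String row) { return conn.rpcGet(row); }
    }

    /** The "real" connection: every call goes over the wire. */
    static final class RemoteConn implements Conn {
        public String rpcGet(String row) { return "remote:" + row; }
        public Table getTable() { return new Table(this); }
    }

    /** Short-circuits rpcGet, but delegates object creation to the real connection. */
    static final class LeakyShortCircuit implements Conn {
        private final Conn real;
        LeakyShortCircuit(Conn real) { this.real = real; }
        public String rpcGet(String row) { return "local:" + row; }
        // The returned table is bound to 'real', so its gets bypass this wrapper.
        public Table getTable() { return real.getTable(); }
    }

    /** Same short circuit, but created objects are bound to the wrapper itself. */
    static final class SelfBoundShortCircuit implements Conn {
        public String rpcGet(String row) { return "local:" + row; }
        public Table getTable() { return new Table(this); }
    }

    public static void main(String[] args) {
        Conn leaky = new LeakyShortCircuit(new RemoteConn());
        System.out.println(leaky.rpcGet("r1"));         // direct call on the wrapper
        System.out.println(leaky.getTable().get("r1")); // call through a created table
        System.out.println(new SelfBoundShortCircuit().getTable().get("r1"));
    }
}
```

In the leaky variant a direct call on the wrapper takes the local path, while the same read issued through a table it handed out takes the remote path, which mirrors the {{Get}} on {{HTable}} case in the description.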