[ https://issues.apache.org/jira/browse/HBASE-13480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707821#comment-14707821 ]

stack commented on HBASE-13480:
-------------------------------

All of my priority RPC handlers are stuck in some variation of this stack:

{code}
"PriorityRpcServer.handler=0,queue=0,port=57983" daemon prio=5 tid=0x00007fb77bd09800 nid=0xf40b in Object.wait() [0x0000000125863000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000007fdd9c700> (a org.apache.hadoop.hbase.ipc.AsyncCall)
    at java.lang.Object.wait(Object.java:461)
    at io.netty.util.concurrent.DefaultPromise.await0(DefaultPromise.java:355)
    - locked <0x00000007fdd9c700> (a org.apache.hadoop.hbase.ipc.AsyncCall)
    at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:266)
    at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:42)
    at org.apache.hadoop.hbase.ipc.AsyncRpcClient.call(AsyncRpcClient.java:232)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:214)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:288)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.get(ClientProtos.java:32876)
    at org.apache.hadoop.hbase.client.HTable$1.call(HTable.java:442)
    at org.apache.hadoop.hbase.client.HTable$1.call(HTable.java:433)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:118)
    at org.apache.hadoop.hbase.client.HTable.get(HTable.java:450)
    at org.apache.hadoop.hbase.client.HTable.get(HTable.java:416)
    at org.apache.hadoop.hbase.MetaTableAccessor.getTableState(MetaTableAccessor.java:1075)
    at org.apache.hadoop.hbase.master.TableStateManager.readMetaState(TableStateManager.java:187)
    at org.apache.hadoop.hbase.master.TableStateManager.getTableState(TableStateManager.java:171)
    at org.apache.hadoop.hbase.master.TableStateManager.isTableState(TableStateManager.java:130)
    at org.apache.hadoop.hbase.master.AssignmentManager.onRegionOpen(AssignmentManager.java:2255)
    at org.apache.hadoop.hbase.master.AssignmentManager.onRegionTransition(AssignmentManager.java:2826)
    at org.apache.hadoop.hbase.master.MasterRpcServices.reportRegionStateTransition(MasterRpcServices.java:1358)
    at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:8623)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2133)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:106)
    at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$2.run(RpcExecutor.java:107)
    at java.lang.Thread.run(Thread.java:744)
{code}
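
The dump above shows a handler thread parked in a future's get() while it is itself serving reportRegionStateTransition. As a rough illustration of why that is dangerous when the nested call has to be served by the same bounded handler pool, here is a minimal sketch. It assumes a plain java.util.concurrent fixed thread pool as a stand-in for the priority handlers; HandlerStarvationSketch and nestedCallTimesOut are hypothetical names and none of this is HBase code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for an RPC handler pool; not HBase code.
final class HandlerStarvationSketch {

    /**
     * Runs one "outer" call on a pool with the given number of handlers.
     * The outer call submits a nested call to the same pool and blocks on it.
     * Returns true if the nested call could not complete within the timeout.
     */
    static boolean nestedCallTimesOut(int handlerCount) throws Exception {
        ExecutorService handlers = Executors.newFixedThreadPool(handlerCount);
        try {
            Future<Boolean> outer = handlers.submit(() -> {
                // The nested call needs a free handler from the same pool.
                Future<String> nested = handlers.submit(() -> "table state");
                try {
                    nested.get(500, TimeUnit.MILLISECONDS);
                    return false;
                } catch (TimeoutException e) {
                    // Every handler is busy waiting, so nothing serves the nested call.
                    return true;
                }
            });
            return outer.get();
        } finally {
            handlers.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("1 handler, nested call timed out: " + nestedCallTimesOut(1));
        System.out.println("2 handlers, nested call timed out: " + nestedCallTimesOut(2));
    }
}
```

With a single handler the nested call can never be scheduled while the outer call is waiting, so it times out; with a spare handler it completes. A timeout is used here only so the sketch terminates.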

> ShortCircuitConnection doesn't short-circuit all calls as expected
> ------------------------------------------------------------------
>
>                 Key: HBASE-13480
>                 URL: https://issues.apache.org/jira/browse/HBASE-13480
>             Project: HBase
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 1.0.0, 2.0.0, 1.1.0
>            Reporter: Josh Elser
>            Assignee: Jingcheng Du
>            Priority: Critical
>             Fix For: 2.0.0, 1.3.0, 1.2.1, 1.0.3, 1.1.3
>
>         Attachments: HBASE-13480-1.patch, HBASE-13480.patch
>
>
> Noticed the following situation while debugging unexpected unit test
> failures in HBASE-13351.
> {{ConnectionUtils#createShortCircuitHConnection(Connection, ServerName, 
> AdminService.BlockingInterface, ClientService.BlockingInterface)}} is 
> intended to avoid the extra RPC by calling the server's own instance of the 
> protobuf RPC stub directly for the AdminService and ClientService.
> The problem is that this is insufficient to actually avoid extra "remote" 
> RPCs, because all other calls to the Connection are routed to a "real" 
> Connection instance. As such, any object created by the "real" Connection 
> (such as an HTable) will use the real Connection, not the short-circuit 
> connection (SSC).
> The end result is that 
> {{MasterRpcServices#reportRegionStateTransition(RpcController, 
> ReportRegionStateTransitionRequest)}} will make additional "remote" RPCs over 
> what it thinks is an SSC: it issues a {{Get}} on an {{HTable}} that was 
> constructed using the SSC, but the {{Get}} itself uses the underlying 
> real Connection instead of the SSC. With insufficiently sized thread pools, 
> this has been observed to result in RPC deadlock in the HMaster: an RPC 
> attempts to make another RPC, but there are no more threads available to 
> service the second RPC, so the first RPC blocks indefinitely.
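
To make the delegation problem in the description concrete, here is a minimal sketch of how a wrapper that short-circuits only its own direct calls can still hand out objects bound to the wrapped connection. Every type here (Connection, Table, RemoteConnection, LeakyShortCircuit, RebindingShortCircuit) is a simplified, hypothetical stand-in and not the HBase API. The second wrapper shows one possible direction for a fix, not necessarily what the attached patches do.

```java
// Simplified, hypothetical model of the delegation leak; not HBase code.
final class ShortCircuitLeakSketch {

    interface Connection {
        String clientCall(String request); // stands in for a ClientService stub call
        Table getTable();                  // stands in for creating an HTable
    }

    /** A table sends its requests through whichever connection created it. */
    static final class Table {
        private final Connection owner;
        Table(Connection owner) { this.owner = owner; }
        String get(String row) { return owner.clientCall("get " + row); }
    }

    /** The "real" connection: every call goes over the wire. */
    static final class RemoteConnection implements Connection {
        public String clientCall(String request) { return "remote:" + request; }
        public Table getTable() { return new Table(this); }
    }

    /** Short-circuits direct calls but delegates table creation to the real connection. */
    static final class LeakyShortCircuit implements Connection {
        private final Connection real;
        LeakyShortCircuit(Connection real) { this.real = real; }
        public String clientCall(String request) { return "local:" + request; }
        // The returned table holds a reference to the real connection, not the wrapper.
        public Table getTable() { return real.getTable(); }
    }

    /** Binds created tables to the wrapper so their calls are short-circuited too. */
    static final class RebindingShortCircuit implements Connection {
        public String clientCall(String request) { return "local:" + request; }
        public Table getTable() { return new Table(this); }
    }

    public static void main(String[] args) {
        Connection leaky = new LeakyShortCircuit(new RemoteConnection());
        System.out.println(leaky.clientCall("get r1"));   // handled locally
        System.out.println(leaky.getTable().get("r1"));   // escapes to the real connection
        System.out.println(new RebindingShortCircuit().getTable().get("r1"));
    }
}
```

In the leaky variant, a Get issued through the table bypasses the wrapper even though the table was obtained from it, which matches the behavior the description reports for HTable.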



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
