[ https://issues.apache.org/jira/browse/PHOENIX-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16067544#comment-16067544 ]
Samarth Jain commented on PHOENIX-3983:
---------------------------------------

The region server hosting SYSTEM.CATALOG issues index rebuild scans against the region servers hosting the data table. If the ServerRpcControllerFactory is configured on the region servers, these scan RPCs have their priority set to the INDEX priority, so they are handled on the destination servers by the index handlers. Those index handlers then do local writes to the data table, which trigger remote RPCs to the index tables. These RPCs are in turn handled by the index handlers on the region servers hosting the index table regions. This can result in a deadlock.

Consider this simple scenario: a three region server setup, where RS2 and RS3 host the data and index table regions.

RS1 - SYSTEM.CATALOG
RS2 - DATA_TABLE, INDEX_TABLE
RS3 - DATA_TABLE, INDEX_TABLE

For simplicity, let's assume that the number of index RPC handlers is 1. Let's name the lone handler T1 on RS2 and T1' on RS3.
Number of regular RPC handlers - 1

RS1 issues a scan on the data table region servers. These scans are handled on RS2 by T1 and on RS3 by T1'.

The index handlers T1 on RS2 and T1' on RS3 then write locally to their data table regions, which results in remote RPCs to RS3 and RS2 respectively.

The RPC from RS3 to RS2 cannot proceed because the index handler T1 on RS2 that could service this call is waiting on its own RPC to RS3 to finish.
The RPC from RS2 to RS3 cannot proceed because the index handler T1' on RS3 that could service this call is waiting on its own RPC to RS2 to finish.

Deadlock.

The fix is to *unset* the server RPC controller factory so that the scans happening on the data table region servers are handled by the default RPC handlers and *not* the index RPC handlers.

Many thanks to [~vincentpoon] for his help in debugging and identifying the issue.

FYI, [~lhofhansl].
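The scenario above can be reproduced in miniature with plain java.util.concurrent thread pools. This is only a sketch: it does not use any HBase or Phoenix API, the class and method names (HandlerDeadlockSketch, rebuildCompletes, scanAndWriteIndex) are hypothetical, and each single-thread executor merely stands in for a handler pool of size 1. A 500 ms timeout is added so the demo terminates; the real RPCs in the scenario would simply keep waiting.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class HandlerDeadlockSketch {

    /**
     * Simulates the rebuild scans on RS2 and RS3.
     * scansUseIndexHandlers = true models scans carrying INDEX priority
     * (controller factory set); false models the scans running on the
     * regular handlers (controller factory unset).
     * Returns true only if both scans finish their remote index write.
     */
    public static boolean rebuildCompletes(boolean scansUseIndexHandlers) throws Exception {
        // One index handler per data-hosting server: T1 on RS2, T1' on RS3.
        ExecutorService rs2Index = Executors.newSingleThreadExecutor();
        ExecutorService rs3Index = Executors.newSingleThreadExecutor();
        // One regular handler per data-hosting server.
        ExecutorService rs2Default = Executors.newSingleThreadExecutor();
        ExecutorService rs3Default = Executors.newSingleThreadExecutor();
        // Makes sure both scans are running before either sends its remote write.
        CountDownLatch bothScansStarted = new CountDownLatch(2);
        try {
            ExecutorService rs2ScanPool = scansUseIndexHandlers ? rs2Index : rs2Default;
            ExecutorService rs3ScanPool = scansUseIndexHandlers ? rs3Index : rs3Default;
            // The scan on RS2 writes to the index region on RS3, and vice versa.
            Callable<Boolean> scanOnRs2 = () -> scanAndWriteIndex(rs3Index, bothScansStarted);
            Callable<Boolean> scanOnRs3 = () -> scanAndWriteIndex(rs2Index, bothScansStarted);
            Future<Boolean> rs2Result = rs2ScanPool.submit(scanOnRs2);
            Future<Boolean> rs3Result = rs3ScanPool.submit(scanOnRs3);
            boolean rs2Done = rs2Result.get();
            boolean rs3Done = rs3Result.get();
            return rs2Done && rs3Done;
        } finally {
            rs2Index.shutdownNow();
            rs3Index.shutdownNow();
            rs2Default.shutdownNow();
            rs3Default.shutdownNow();
        }
    }

    /** A scan that triggers an index write on the remote server and waits for it. */
    private static boolean scanAndWriteIndex(ExecutorService remoteIndexHandlers,
                                             CountDownLatch bothScansStarted) throws Exception {
        bothScansStarted.countDown();
        bothScansStarted.await();
        // The remote index write itself is trivial; it only needs a free index handler.
        Future<?> indexWrite = remoteIndexHandlers.submit(() -> { });
        try {
            indexWrite.get(500, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // The only remote index handler was busy running a scan.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("scans on index handlers complete: " + rebuildCompletes(true));
        System.out.println("scans on default handlers complete: " + rebuildCompletes(false));
    }
}
```

With the scans on the index handlers, at least one of the two simulated scans times out, because the handler it needs on the other server is itself blocked in a scan. With the scans moved to the regular handlers, which is what unsetting the controller factory achieves in this model, the index handlers stay free and both writes complete.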
> Index rebuild scans should not be using the ServerRpcControllerFactory
> ----------------------------------------------------------------------
>
>                 Key: PHOENIX-3983
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3983
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Samarth Jain
>            Assignee: Samarth Jain
>

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)