[ 
https://issues.apache.org/jira/browse/PHOENIX-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475015#comment-15475015
 ] 

Enis Soztutar commented on PHOENIX-3072:
----------------------------------------

bq. On the RS, we already make index table updates higher priority than data 
table updates
This happens on region open, and does not involve the RPC scheduling. In a 
cluster restart, all of the index and data table regions will be opened by the 
regionservers. There are only 3 threads that open regions by default, and for 
the data tables, the opening of the region blocks on doing the index updates. 
However, if the index regions are not opened yet, then the index updates will 
not succeed even if the regionserver RPC scheduling works. The index regions 
will be waiting in the same "region opening queue" to be opened by the same 
regionserver. 
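
To make the queueing problem concrete, here is a minimal, self-contained sketch 
in plain Java. It is not HBase or Phoenix code: the class name, the region 
counts and the latch standing in for "index region is open" are all assumptions 
made up for illustration. It only models one fixed pool of 3 opener threads 
whose data-region tasks wait on an index region that is queued behind them, 
and compares that with giving index regions their own pool.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class RegionOpenDeadlockSketch {

    // 3 opener threads, matching the default mentioned above.
    static final int OPEN_THREADS = 3;
    // Hypothetical number of data regions; anything >= OPEN_THREADS stalls.
    static final int DATA_REGIONS = 4;

    /**
     * Queues DATA_REGIONS data-region opens followed by one index-region open.
     * Returns true if every data-region open finishes within timeoutMs.
     */
    static boolean allDataRegionsOpen(boolean separateIndexPool, long timeoutMs)
            throws InterruptedException {
        ExecutorService dataPool = Executors.newFixedThreadPool(OPEN_THREADS);
        ExecutorService indexPool = separateIndexPool
                ? Executors.newFixedThreadPool(OPEN_THREADS)
                : dataPool;
        CountDownLatch indexOpen = new CountDownLatch(1);
        CountDownLatch dataOpen = new CountDownLatch(DATA_REGIONS);
        try {
            for (int i = 0; i < DATA_REGIONS; i++) {
                dataPool.execute(() -> {
                    try {
                        // Stand-in for the index writes done during WAL
                        // recovery: they need the index region to be open.
                        indexOpen.await();
                        dataOpen.countDown();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // The index-region open is queued after the data-region opens.
            indexPool.execute(indexOpen::countDown);
            return dataOpen.await(timeoutMs, TimeUnit.MILLISECONDS);
        } finally {
            // Interrupt any opener threads that are still blocked.
            dataPool.shutdownNow();
            indexPool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("shared pool:   " + allDataRegionsOpen(false, 500));
        System.out.println("separate pool: " + allDataRegionsOpen(true, 500));
    }
}
```

With the shared pool, all 3 threads are held by data-region opens and the 
index-region open never leaves the queue, so the call times out. With a 
separate pool the index region opens immediately and the data regions follow.
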
bq. Also, would you mind generating a patch that ignores whitespace changes as 
it's difficult to find the change you've made.
Sorry, the existing code is full of extra whitespace, and my Eclipse settings 
strip it as a save action. This is to make sure that my patches do not 
introduce any more extra whitespace. I can put the patch in RB/github if 
you want. 

> Deadlock on region opening with secondary index recovery
> --------------------------------------------------------
>
>                 Key: PHOENIX-3072
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3072
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 4.9.0, 4.8.1
>
>         Attachments: phoenix-3072_v1.patch
>
>
> There is a distributed deadlock happening in clusters with a moderate 
> number of regions for the data tables and secondary index tables when there 
> is a cluster restart or some large failure. We have seen this in a 
> couple of production cases already. 
> Opening of regions in hbase is performed by a thread pool with 3 threads by 
> default, so every regionserver can open 3 regions at a time. However, opening 
> data table regions has to write to multiple index regions during WAL 
> recovery. All other region open requests are queued up in a single queue. 
> This causes a deadlock, since the secondary index regions are also opened by 
> the same thread pool that is doing this work. So if there are more data 
> table regions than available region opening threads on the regionservers, 
> the secondary index region open requests just wait in the queue to be 
> processed. Since these index regions are not open, the opening of data 
> table regions just blocks the region opening threads for a long time. 
> One proposed fix is to use a different thread pool for opening regions of the 
> secondary index tables so that we will not deadlock. See HBASE-16095 for the 
> HBase-level fix. In Phoenix, we just have to set the priority for secondary 
> index tables. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
