[jira] [Commented] (PHOENIX-3072) Deadlock on region opening with secondary index recovery

James Taylor (JIRA) Thu, 08 Sep 2016 14:33:47 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475067#comment-15475067
 ]


James Taylor commented on PHOENIX-3072:
---------------------------------------

If we can get this in shape for 4.8.1, that'd be great, [~enis]. I agree, it 
seems important. Some questions/comments:
- It's difficult to tell what's changed with all the whitespace diffs. Can you 
generate a patch without that?
- It looks like you're setting a new "PRIORITY" attribute on table descriptor 
for indexes? How/where is this used?
- How will you handle local indexes, since the table descriptor is the same for 
the data and index table? Should we add it as a column descriptor attribute 
instead, or would we not know which column families are involved when we're 
using this info?
- Minor nit: I suppose you're not using the HBase static constant for 
"PRIORITY" because this doesn't appear until HBase 1.3? Maybe we should define 
one in QueryConstants with a comment?
- Didn't priority get exposed as an attribute on operations now? If so, would 
that be an alternate implementation mechanism which is a bit more flexible?
- What about existing tables and indexes - I didn't see any upgrade code that 
sets this for those. If setting priority on operation is an option, that'd get 
around this.

> Deadlock on region opening with secondary index recovery
> --------------------------------------------------------
>
>                 Key: PHOENIX-3072
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3072
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 4.9.0, 4.8.1
>
>         Attachments: phoenix-3072_v1.patch
>
>
> There is a distributed deadlock happening in clusters with a moderate 
> number of regions for the data tables and secondary index tables when there 
> is a cluster restart or some large failure. We have seen this in a 
> couple of production cases already. 
> Opening of regions in hbase is performed by a thread pool with 3 threads by 
> default. Every regionserver can open 3 regions at a time. However, opening 
> data table regions has to write to multiple index regions during WAL 
> recovery. All other region open requests are queued up in a single queue. 
> This causes a deadlock, since the secondary index regions are also opened by 
> the same thread pool that does this work. So if there are more data table 
> regions than available region opening threads on the regionservers, the 
> secondary index region open requests just wait to be processed in the queue. 
> Since these index regions are not open, the opening of data table regions 
> just blocks the region opening threads for a long time.  
> One proposed fix is to use a different thread pool for opening regions of the 
> secondary index tables so that we will not deadlock. See HBASE-16095 for the 
> HBase-level fix. In Phoenix, we just have to set the priority for secondary 
> index tables. 
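
The deadlock in the quoted description can be reproduced in miniature with 
plain java.util.concurrent. This is only a sketch, not HBase or Phoenix code: 
the class name, the method name, the latch standing in for "index region is 
open", and the timeout are all assumptions made for illustration. The timeout 
exists only so the sketch terminates; in the scenario described above the 
region opening threads stay blocked for a long time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical simulation; none of these names come from HBase or Phoenix.
class RegionOpenDeadlockSketch {
    // The description says region opening uses 3 threads by default.
    static final int OPEN_THREADS = 3;

    /** Returns how many simulated data regions opened before the timeout. */
    static int openRegions(boolean separateIndexPool, long timeoutMs) throws Exception {
        ExecutorService dataPool = Executors.newFixedThreadPool(OPEN_THREADS);
        // With separateIndexPool == false, index opens share the data pool's queue.
        ExecutorService indexPool =
            separateIndexPool ? Executors.newFixedThreadPool(OPEN_THREADS) : dataPool;
        // Counted down once the (single) simulated index region is open.
        CountDownLatch indexOpen = new CountDownLatch(1);

        // A data region open must write to the index region, so it waits for it.
        Callable<Boolean> openDataRegion =
            () -> indexOpen.await(timeoutMs, TimeUnit.MILLISECONDS);
        Runnable openIndexRegion = indexOpen::countDown;

        List<Future<Boolean>> dataOpens = new ArrayList<>();
        try {
            // Enough data region opens to occupy every opener thread.
            for (int i = 0; i < OPEN_THREADS; i++) {
                dataOpens.add(dataPool.submit(openDataRegion));
            }
            // The index open arrives last; in a shared pool it sits in the queue.
            indexPool.execute(openIndexRegion);

            int opened = 0;
            for (Future<Boolean> f : dataOpens) {
                if (f.get()) {
                    opened++;
                }
            }
            return opened;
        } finally {
            dataPool.shutdownNow();
            indexPool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("shared pool:   " + openRegions(false, 200) + " data regions opened");
        System.out.println("separate pool: " + openRegions(true, 2000) + " data regions opened");
    }
}
```

With a shared pool, every opener thread is held by a data region waiting on an 
index region whose open request is stuck behind it in the queue, so no data 
region opens until the waits time out. With a separate pool for index regions, 
which is the shape of the fix proposed above, the index open is not queued 
behind the data opens and they can complete.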



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
