[ https://issues.apache.org/jira/browse/HBASE-16209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15403193#comment-15403193 ]
Duo Zhang commented on HBASE-16209:
-----------------------------------

And in my experience, there is no NPE. The problem is that when moving a region, the region does not come back online quickly enough after going offline, and this causes a lot of UTs to fail. I found that the cause is that we call invokeAssignLater in ClosedRegionHandler, which adds a delay before the region is assigned to a new place. The original implementation called assign directly. Is there any reason why we should change it to invokeLater?

Thanks very much.

> Provide an ExponentialBackOffPolicy sleep between failed region open requests
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-16209
>                 URL: https://issues.apache.org/jira/browse/HBASE-16209
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Joseph
>            Assignee: Joseph
>             Fix For: 2.0.0, 1.4.0
>
>         Attachments: HBASE-16209-addendum.patch, 
> HBASE-16209-branch-1-addendum-v2.patch, HBASE-16209-branch-1-addendum.patch, 
> HBASE-16209-branch-1-v3.patch, HBASE-16209-branch-1.patch, 
> HBASE-16209-v2.patch, HBASE-16209.patch
>
>
> Related to HBASE-16138. We currently have no pause between retries of 
> failed region open requests. With a low maximumAttempt default, we can 
> quickly use up all our regionOpen retries if the server is in a bad state. I 
> added an ExponentialBackOffPolicy so that we spread out the timing of our 
> open region retries in AssignmentManager. Review board at 
> https://reviews.apache.org/r/50011/

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)