[ https://issues.apache.org/jira/browse/HBASE-19144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236337#comment-16236337 ]
Andrew Purtell commented on HBASE-19144:
----------------------------------------

bq. This was just plain wrong before, right (nothing to do with this change)?

Yes, just wrong. I'm fixing it here because I need this same conditional later, so making it right in all uses.

> [RSgroups] Retry assignments in FAILED_OPEN state when servers (re)join the cluster
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-19144
>                 URL: https://issues.apache.org/jira/browse/HBASE-19144
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>            Priority: Major
>             Fix For: 2.0.0, 3.0.0, 1.4.0, 1.5.0
>
>         Attachments: HBASE-19144-branch-1.patch, HBASE-19144.patch
>
>
> After all servers in the RSgroup are down the regions cannot be opened
> anywhere and transition rapidly into FAILED_OPEN state.
>
> 2017-10-31 21:06:25,449 INFO [ProcedureExecutor-13] master.RegionStates: Transition {c6c8150c9f4b8df25ba32073f25a5143 state=OFFLINE, ts=1509483985448, server=node-5.cluster,16020,1509482700768} to {c6c8150c9f4b8df25ba32073f25a5143 state=FAILED_OPEN, ts=1509483985449, server=node-5.cluster,16020,1509482700768}
> 2017-10-31 21:06:25,449 WARN [ProcedureExecutor-13] master.RegionStates: Failed to open/close d4e2f173e31ffad6aac140f4bd7b02bc on node-5.cluster,16020,1509482700768, set to FAILED_OPEN
>
> Any region in FAILED_OPEN state has to be manually reassigned, or the master
> can be restarted and this will also cause reattempt of assignment of any
> regions in FAILED_OPEN state. This is not unexpected but is an operational
> headache. It would be better if the RSGroupInfoManager could automatically
> kick reassignments of regions in FAILED_OPEN state when servers rejoin the
> cluster.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)