[ 
https://issues.apache.org/jira/browse/HBASE-26298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17421609#comment-17421609
 ] 

Viraj Jasani commented on HBASE-26298:
--------------------------------------

{quote}Part of me almost wonders if it makes sense to create a "rolling 
[upgrade | downgrade] [start | finalize]" command to more clearly formalize the 
handling of system tables (and maybe other features) during these operations.
{quote}
Yes, something similar to what HDFS provides would be great, but we would first 
have to brainstorm all the cases to handle as part of upgrade and downgrade. 

[~bbeaudreault] Please feel free to update the docs for any configs where you 
feel that would make the description more user-friendly; that would be a great 
help! Perhaps we can track the work as part of this Jira as we discover such gaps.

> Downgrading is complicated by refusal to assign system tables to lower version
> ------------------------------------------------------------------------------
>
>                 Key: HBASE-26298
>                 URL: https://issues.apache.org/jira/browse/HBASE-26298
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Bryan Beaudreault
>            Priority: Minor
>
> I was doing some rolling downgrades of test clusters and kept getting into a 
> state where my automation gets stuck trying to drain the final RegionServer 
> in the cluster. At this point that RegionServer hosts 3 regions: meta, quota, 
> namespace. The HMaster is outputting logs like: "Passed destination 
> servername is null/empty so choosing a server at random".
> It's very hard to understand what's happening based on that log, so you really 
> have to look at the code. Tracking down that log line, it becomes somewhat 
> clear that you are getting trapped by 
> AssignmentManager.getExcludedServersForSystemTable().
> Looking at the code, you can see comments related to 
> "hbase.min.version.move.system.tables" config, but the comments are very 
> unclear. What should I set this to?
> This setting was added in https://issues.apache.org/jira/browse/HBASE-22923 
> which focuses mostly on RSGroup, but this issue is affecting clusters that do 
> not use RSGroup. The release note is also not very clear.
> It would be great to clarify the docs to help the operator know what to 
> change this to, or perhaps make the config itself more intuitive. For 
> example, could we just make it an allowlist of versions that can hold system 
> tables? At that point my path is clear: add the version I'm downgrading to to 
> the allowlist.
> This issue is also exacerbated by the fact that by the time you've realized 
> this you're in a somewhat tricky situation where there's only 1 RegionServer 
> left and your only way around it is to force stop it or to push a new config 
> and rolling restart your HMasters. It would be great if this setting were 
> able to be updated via Admin or at the very least reloadable with 
> ConfigurationObserver.
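
To make the allowlist idea from the description concrete, here is a minimal, 
self-contained Java sketch. It is not HBase code: the class and method names, 
the server names and the version strings are all hypothetical, and the first 
method only approximates the behaviour the report describes (servers on a 
lower version than the highest one in the cluster are refused system tables). 
The real logic lives in AssignmentManager.getExcludedServersForSystemTable() 
and may differ in detail.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names exist in HBase.
public class SystemTableExclusionSketch {

    // Compare dotted numeric versions such as "2.4.6" and "2.3.7",
    // component by component; missing components count as 0.
    static int compareVersions(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        int n = Math.max(pa.length, pb.length);
        for (int i = 0; i < n; i++) {
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    // Assumed approximation of the reported behaviour: every server running
    // a version lower than the highest version present is excluded.
    static Set<String> excludedByHighestVersion(Map<String, String> serverVersions) {
        String highest = null;
        for (String v : serverVersions.values()) {
            if (highest == null || compareVersions(v, highest) > 0) {
                highest = v;
            }
        }
        Set<String> excluded = new TreeSet<>();
        for (Map.Entry<String, String> e : serverVersions.entrySet()) {
            if (compareVersions(e.getValue(), highest) < 0) {
                excluded.add(e.getKey());
            }
        }
        return excluded;
    }

    // Proposed alternative: a server is excluded only if its version is not
    // on an operator-supplied allowlist.
    static Set<String> excludedByAllowlist(Map<String, String> serverVersions,
                                           Set<String> allowedVersions) {
        Set<String> excluded = new TreeSet<>();
        for (Map.Entry<String, String> e : serverVersions.entrySet()) {
            if (!allowedVersions.contains(e.getValue())) {
                excluded.add(e.getKey());
            }
        }
        return excluded;
    }

    public static void main(String[] args) {
        // Mid-downgrade cluster: rs1 is the last server on the newer version.
        Map<String, String> servers = new LinkedHashMap<>();
        servers.put("rs1", "2.4.6");
        servers.put("rs2", "2.3.7");
        servers.put("rs3", "2.3.7");

        // Highest-version rule: the downgraded servers cannot take system
        // tables, so rs1 can never be drained.
        System.out.println(excludedByHighestVersion(servers)); // [rs2, rs3]

        // Allowlist rule: adding the downgrade target unblocks the drain.
        Set<String> allowed = new HashSet<>(Arrays.asList("2.4.6", "2.3.7"));
        System.out.println(excludedByAllowlist(servers, allowed)); // []
    }
}
```

With the highest-version rule, the operator has to reason about an ordering 
between versions; with an allowlist, the downgrade step is simply "add the 
target version, then drain".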



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
