[ https://issues.apache.org/jira/browse/KUDU-2912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexey Serbin resolved KUDU-2912.
---------------------------------
    Fix Version/s: n/a
       Resolution: Information Provided

> Document zero-downtime workflow for 'forgetting' dead tservers
> --------------------------------------------------------------
>
>                 Key: KUDU-2912
>                 URL: https://issues.apache.org/jira/browse/KUDU-2912
>             Project: Kudu
>          Issue Type: Bug
>          Components: documentation
>    Affects Versions: 1.11.0
>            Reporter: Adar Dembo
>            Priority: Major
>             Fix For: n/a
>
>
> This is a fairly useful workflow when the goal is to rebalance the cluster.
> All it takes is one dead tserver (supposing it's decommissioned and long
> gone) for rebalancing to refuse to run. As of 1.10.0 there's a CLI parameter
> that instructs the rebalancer to ignore certain tservers, but it's annoying
> to put together a UUID list when multiple tservers are dead.
> Anyway, the zero-downtime workflow is:
> # Restart all of the masters in the cluster one by one.
> # After each restart, wait for the restarted master to load its tablet and
> join consensus (ksck should be able to indicate when this was achieved).

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
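
The two-step workflow quoted above (restart one master, wait until it is healthy, move on to the next) could be sketched roughly as follows. This is only an illustration, not part of Kudu: `rolling_restart`, `restart`, and `is_healthy` are hypothetical names, and both callables must be supplied by the operator. For example, `is_healthy` might inspect the output of `kudu cluster ksck` to confirm the restarted master has rejoined consensus; how to do that reliably is left as an assumption here.

```python
import time
from typing import Callable, Iterable


def rolling_restart(
    masters: Iterable[str],
    restart: Callable[[str], None],      # hypothetical: restarts one master
    is_healthy: Callable[[str], bool],   # hypothetical: e.g. derived from ksck output
    poll_interval_s: float = 5.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Restart masters strictly one at a time.

    The next master is restarted only after the previous one reports
    healthy, so a majority of masters stays up throughout.
    """
    for master in masters:
        restart(master)
        deadline = clock() + timeout_s
        # Poll until the restarted master has loaded its tablet and
        # rejoined consensus, as judged by the supplied health check.
        while not is_healthy(master):
            if clock() >= deadline:
                raise TimeoutError(
                    f"master {master} did not become healthy within {timeout_s}s"
                )
            sleep(poll_interval_s)
```

Passing the restart and health-check logic in as callables keeps the sketch independent of how masters are actually managed (systemd, a cluster manager, and so on), and stopping on the first timeout avoids taking down a second master while the first is still recovering.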