[jira] [Resolved] (KUDU-2912) Document zero-downtime workflow for 'forgetting' dead tservers

Alexey Serbin (Jira) Tue, 12 Sep 2023 18:57:07 -0700


     [ https://issues.apache.org/jira/browse/KUDU-2912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Alexey Serbin resolved KUDU-2912.
---------------------------------
    Fix Version/s: n/a
       Resolution: Information Provided

> Document zero-downtime workflow for 'forgetting' dead tservers
> --------------------------------------------------------------
>
>                 Key: KUDU-2912
>                 URL: https://issues.apache.org/jira/browse/KUDU-2912
>             Project: Kudu
>          Issue Type: Bug
>          Components: documentation
>    Affects Versions: 1.11.0
>            Reporter: Adar Dembo
>            Priority: Major
>             Fix For: n/a
>
>
> This is a fairly useful workflow when the goal is to rebalance the cluster. 
> All it takes is one dead tserver (supposing it's decommissioned and long 
> gone) for rebalancing to refuse to run. As of 1.10.0 there's a CLI parameter 
> that instructs the rebalancer to ignore certain tservers, but it's annoying 
> to put together a UUID list when multiple tservers are dead.
> Anyway, the zero-downtime workflow is:
> # Restart all of the masters in the cluster one by one.
> # After each restart, wait for the restarted master to load its tablet and
> join consensus (ksck should be able to indicate when this has been achieved).
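
The two steps quoted above could be scripted roughly as follows. This is
only a minimal sketch, not part of the original issue. The callables
restart_master and master_has_rejoined are hypothetical hooks that the
operator would have to supply (for example, a service-manager call and a
check based on ksck output); nothing here calls a real Kudu API.

```python
import time
from typing import Callable, Sequence


def rolling_master_restart(
    masters: Sequence[str],
    restart_master: Callable[[str], None],
    master_has_rejoined: Callable[[str], bool],
    poll_interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Restart masters one at a time, waiting for each to rejoin.

    restart_master and master_has_rejoined are hypothetical,
    operator-supplied hooks (assumptions, not Kudu APIs). The next
    master is not touched until the previous one reports that it has
    loaded its tablet and rejoined consensus.
    """
    for master in masters:
        restart_master(master)
        waited = 0.0
        # Poll until the restarted master is back, or give up.
        while not master_has_rejoined(master):
            if waited >= timeout_s:
                raise TimeoutError(
                    f"master {master} did not rejoin within {timeout_s}s"
                )
            sleep(poll_interval_s)
            waited += poll_interval_s
```

Restarting strictly one master at a time is what keeps the workflow
zero-downtime: the remaining masters keep serving while one is down. The
timeout stops the loop from moving on (or hanging forever) if a master
fails to come back.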



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
