Alexey Serbin created KUDU-2247:
-----------------------------------

             Summary: Update doc on the 'kudu tablet change_config 
move_replica' after 3-4-3 enabled by default
                 Key: KUDU-2247
                 URL: https://issues.apache.org/jira/browse/KUDU-2247
             Project: Kudu
          Issue Type: Task
          Components: documentation
    Affects Versions: 1.7
            Reporter: Alexey Serbin
            Assignee: Alexey Serbin


The way replica replacement works in the 3-4-3 v1 design scheme has a few 
corner cases in very specific run-time scenarios.  They stem from the absence 
of the {{SUPERCEDES}} attribute, which is part of the full 3-4-3 design proposal 
(option E) but not of the 3-4-3 v1 design.  One such scenario:

* Initial configuration is {{[ A(V:+:), B(V:+:), C(V:+:) ]}}
* A voter replica {{A}} is marked with the {{REPLACE}} attribute: {{[ 
A(V:+:REPLACE=true), B(V:+:), C(V:+:) ]}}
* A non-voter replica {{X}} is added to replace replica {{A}}: {{[ 
A(V:+:REPLACE=true), B(V:+:), C(V:+:), X(N:+:PROMOTE=true) ]}}
* Replica {{B}} fails, so the system adds another non-voter replica {{Y}} to 
replace the failed replica: {{[ A(V:+:REPLACE=true), B(V:-:), C(V:+:), 
X(N:+:PROMOTE=true), Y(N:+:PROMOTE=true) ]}}
* After some time, before tablet copying is complete for either of the two 
replicas {{X}} or {{Y}}, replica {{B}} comes back, so the system evicts replica 
{{X}}: {{[ A(V:+:REPLACE=true), B(V:+:), C(V:+:), Y(N:+:PROMOTE=true) ]}}
* Eventually, replica {{Y}} completes copying the data, catches up with the 
leader and is promoted by the leader replica: {{[ A(V:+:REPLACE=true), B(V:+:), 
C(V:+:), Y(V:+:) ]}}
* The next step is removing replica {{A}}, so the resulting configuration is {{[ 
B(V:+:), C(V:+:), Y(V:+:) ]}} instead of the expected {{[ B(V:+:), C(V:+:), 
X(V:+:) ]}}
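
The sequence above can be replayed with a small Python sketch. This is not Kudu code: the {{Replica}} class and the {{update}}, {{evict}}, {{show}}, and {{simulate}} helpers are hypothetical names made up for illustration, and the steps are hard-coded to mirror the scenario rather than derived from the actual 3-4-3 replica management logic. It only shows how the configuration strings evolve and that the final configuration contains {{Y}} rather than {{X}}.

```python
import dataclasses
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    # Hypothetical model of one entry in the Raft configuration.
    name: str
    voter: bool = True      # V (voter) or N (non-voter)
    healthy: bool = True    # + (healthy) or - (failed)
    attrs: tuple = ()       # e.g. ("REPLACE=true",) or ("PROMOTE=true",)

    def __str__(self):
        role = "V" if self.voter else "N"
        health = "+" if self.healthy else "-"
        return f"{self.name}({role}:{health}:{','.join(self.attrs)})"


def show(config):
    """Render a configuration in the notation used in this issue."""
    return "[ " + ", ".join(str(r) for r in config) + " ]"


def update(config, name, **changes):
    """Return a new configuration with the named replica modified."""
    return [dataclasses.replace(r, **changes) if r.name == name else r
            for r in config]


def evict(config, name):
    """Return a new configuration without the named replica."""
    return [r for r in config if r.name != name]


def simulate():
    """Replay the scenario step by step; return the rendered configs."""
    non_voter = dict(voter=False, attrs=("PROMOTE=true",))
    history = []

    config = [Replica("A"), Replica("B"), Replica("C")]
    history.append(show(config))                      # initial configuration

    config = update(config, "A", attrs=("REPLACE=true",))
    history.append(show(config))                      # A marked for replacement

    config = config + [Replica("X", **non_voter)]
    history.append(show(config))                      # X added to replace A

    config = update(config, "B", healthy=False) + [Replica("Y", **non_voter)]
    history.append(show(config))                      # B fails, Y added

    config = evict(update(config, "B", healthy=True), "X")
    history.append(show(config))                      # B is back, X evicted

    config = update(config, "Y", voter=True, attrs=())
    history.append(show(config))                      # Y promoted to voter

    config = evict(config, "A")
    history.append(show(config))                      # A removed
    return history


if __name__ == "__main__":
    for line in simulate():
        print(line)
```

The last line printed is the final configuration, in which {{Y}} holds the slot that the move was intended to give to {{X}}.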


In this context, it's necessary to document that the 'target' replica is not 
a guaranteed destination of the replica move process, but just a pivot: as the 
scenario shows, the move may complete with a different replica taking the place 
of the source.  Also, it makes sense to update the CLI tool to accept only the 
source replica as an argument, with the target replica selected by the system 
itself (if that's not the case already).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
