[ 
https://issues.apache.org/jira/browse/KUDU-3752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18066546#comment-18066546
 ] 

Marton Greber commented on KUDU-3752:
-------------------------------------

[~aserbin] mentioned the following in another channel: "for Kudu masters we 
could enable a special WAL segment anchoring mode where segments can be GC-ed 
only if all of the system catalog tablet replicas are caught up to the GC 
cut-off."
^ I think this is worth considering.
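
To make the quoted idea concrete, here is a minimal sketch of the anchoring
rule. It is written in Python for brevity (Kudu itself is C++). The function
name and the replica-to-index map are hypothetical, not Kudu APIs. The
assumption is that the leader tracks, for each system catalog replica, the
last op index that replica is known to have received:

```python
def anchored_gc_cutoff(proposed_cutoff, replica_last_index):
    """Return the highest op index up to which WAL segments may be GC-ed.

    proposed_cutoff: index the normal retention policy would GC up to.
    replica_last_index: hypothetical map of replica id -> last op index
        that replica is known to have received.
    """
    if not replica_last_index:
        # Nothing is known about the peers: be conservative, GC nothing.
        return 0
    # The slowest replica anchors the log: never GC past what it has.
    slowest = min(replica_last_index.values())
    return min(proposed_cutoff, slowest)
```

With replicas at indexes 150, 40 and 120 and a proposed cut-off of 100, the
sketch holds GC back to 40. The apparent trade-off is that a replica that
stays behind for a long time keeps the WAL from shrinking.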

> No auto-recovery when WAL fallback target is also GC'd
> ------------------------------------------------------
>
>                 Key: KUDU-3752
>                 URL: https://issues.apache.org/jira/browse/KUDU-3752
>             Project: Kudu
>          Issue Type: Bug
>            Reporter: Marton Greber
>            Priority: Major
>
> When a leader detects an LMP_MISMATCH with a follower, it attempts to
> reconcile by falling back to the follower's last committed index and
> replaying WAL from that point. If the WAL for that fallback index has already
> been garbage collected, Kudu correctly logs "the follower will never be able
> to catch up", but then takes no corrective action. The LMP_MISMATCH status is
> retried in an infinite loop, and the follower remains permanently stuck.
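
The stuck state described above can be illustrated with a small sketch
(Python, not Kudu code). The names `CatchUp`, `classify_fallback` and
`min_retained_index` are hypothetical, and the exact boundary condition is an
assumption. The point is only that once the fallback index is older than the
oldest retained WAL entry, retrying cannot succeed, so the condition could be
surfaced as a distinct state instead of being retried:

```python
from enum import Enum


class CatchUp(Enum):
    REPLAY_FROM_WAL = 1   # leader still has the ops the follower needs
    NEEDS_FULL_COPY = 2   # ops were GC-ed; WAL replay can never succeed


def classify_fallback(fallback_index, min_retained_index):
    """Decide whether a follower can catch up by WAL replay.

    fallback_index: follower's last committed index (the replay start).
    min_retained_index: hypothetical oldest op index still in the
        leader's WAL.
    """
    if fallback_index < min_retained_index:
        # Replay would have to start in a segment that no longer exists.
        return CatchUp.NEEDS_FULL_COPY
    return CatchUp.REPLAY_FROM_WAL
```

For example, a fallback index of 10 against a WAL that starts at 50 is
classified as `NEEDS_FULL_COPY`. What corrective action follows from that
state is left open here, as the issue does not prescribe one.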



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
