[ https://issues.apache.org/jira/browse/KUDU-3752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18065612#comment-18065612 ]
Marton Greber commented on KUDU-3752:
-------------------------------------
The correct recovery action in this state is to trigger a remote bootstrap
(tablet copy) of the follower from the leader, which bypasses WAL replication
entirely. This is not done automatically.
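
As a rough illustration of the missing decision, the sketch below models the
choice between WAL replay and tablet copy. It is not Kudu code: the names
CatchUpAction, choose_catch_up_action, and min_retained_wal_index are
hypothetical, and it makes the simplifying assumption that replay is possible
only if the op at the fallback index is still retained in the leader's WAL.

```python
from enum import Enum


class CatchUpAction(Enum):
    """Hypothetical outcomes for bringing a lagging follower up to date."""
    REPLAY_WAL = "replay_wal"
    TABLET_COPY = "tablet_copy"


def choose_catch_up_action(fallback_index: int,
                           min_retained_wal_index: int) -> CatchUpAction:
    """Decide how the leader could respond after an LMP_MISMATCH.

    fallback_index: the follower's last committed index, which the leader
        falls back to when reconciling.
    min_retained_wal_index: lowest op index still present in the leader's
        WAL (hypothetical name; everything below it has been GC'd).
    """
    # Assumption: replay works only if the fallback op has not been GC'd.
    if fallback_index >= min_retained_wal_index:
        return CatchUpAction.REPLAY_WAL
    # The needed WAL is gone, so retrying replication can never succeed;
    # copying the tablet from the leader bypasses the WAL entirely.
    return CatchUpAction.TABLET_COPY
```

With these assumptions, a follower whose fallback index is 10 while the leader
retains only ops from index 50 onward would be routed to TABLET_COPY instead
of being retried indefinitely.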
> No auto-recovery when WAL fallback target is also GC'd
> ------------------------------------------------------
>
> Key: KUDU-3752
> URL: https://issues.apache.org/jira/browse/KUDU-3752
> Project: Kudu
> Issue Type: Bug
> Reporter: Marton Greber
> Priority: Major
>
> When a leader detects an LMP_MISMATCH with a follower, it attempts to
> reconcile by falling back to the follower's last committed index and
> replaying WAL from that point. If the WAL for that fallback index has already
> been garbage collected, Kudu correctly logs "the follower will never be able
> to catch up", but then takes no corrective action. The LMP_MISMATCH status is
> retried in an infinite loop, and the follower remains permanently stuck.