[jira] [Commented] (KUDU-3752) No auto-recovery when WAL fallback target is also GC'd

Marton Greber (Jira) Fri, 13 Mar 2026 03:48:12 -0700


    [ 
https://issues.apache.org/jira/browse/KUDU-3752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18065612#comment-18065612
 ]


Marton Greber commented on KUDU-3752:
-------------------------------------

In this state, the correct recovery action is to trigger a remote bootstrap
(tablet copy) of the follower from the leader, which bypasses WAL replication
entirely. Kudu does not currently trigger this automatically.

> No auto-recovery when WAL fallback target is also GC'd
> ------------------------------------------------------
>
>                 Key: KUDU-3752
>                 URL: https://issues.apache.org/jira/browse/KUDU-3752
>             Project: Kudu
>          Issue Type: Bug
>            Reporter: Marton Greber
>            Priority: Major
>
> When a leader detects an LMP_MISMATCH with a follower, it attempts to reconcile
> by falling back to the follower's last committed index and replaying WAL from
> that point. If the WAL for that fallback index has already been garbage
> collected, Kudu correctly logs "the follower will never be able to catch up" —
> but then takes no corrective action. The LMP_MISMATCH status is retried in an
> infinite loop, and the follower remains permanently stuck.
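
The decision described above can be sketched as follows. This is a minimal
illustration, not Kudu's actual code: the names choose_recovery, Action,
fallback_index and min_retained_index are hypothetical, and it assumes that
replay from the fallback index is possible only while that index is still
present in the leader's WAL.

```python
from enum import Enum, auto


class Action(Enum):
    """Possible reactions to an LMP_MISMATCH (hypothetical names)."""
    REPLAY_WAL = auto()   # normal path: resend ops from the leader's WAL
    TABLET_COPY = auto()  # remote bootstrap of the follower from the leader


def choose_recovery(fallback_index: int, min_retained_index: int) -> Action:
    """Pick a recovery action for a follower that reported LMP_MISMATCH.

    fallback_index: the follower's last committed index.
    min_retained_index: oldest op index the leader's WAL still holds;
        anything below it is assumed to have been garbage collected.
    """
    if fallback_index >= min_retained_index:
        # The fallback target is still in the WAL, so replay can proceed.
        return Action.REPLAY_WAL
    # The fallback target was GC'd: retrying can never succeed, so
    # escalate to a tablet copy instead of looping on LMP_MISMATCH.
    return Action.TABLET_COPY
```

With these assumptions, choose_recovery(10, 50) selects the tablet copy path,
which is the step the issue reports as missing: today the mismatch is simply
retried.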



