[ 
https://issues.apache.org/jira/browse/KAFKA-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410682#comment-15410682
 ] 

Flavio Junqueira commented on KAFKA-1211:
-----------------------------------------

[~junrao] I get (2), but (1) might need a bit more tweaking because of the 
following. Say the follower executes the steps in the order you describe:

# Truncate log
# Update LGS to reflect the truncated log (flush the prefix of LGS whose start 
offset is up to the log end offset)
# Fetch and update the LGS as messages cross leader boundaries

If the follower crashes in the middle of fetching and updating the LGS, it may 
leave the LGS in an inconsistent state. For example, say it crosses a 
generation boundary, writes to the log, and crashes before updating the LGS. 
I actually think it might be better to update the LGS first, because in the 
worst case it points to an offset that is not in the log, so we know that the 
LGS entry is invalid. In any case, it sounds like there is some work to be 
done to make sure the LGS and the log are consistent.
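
A minimal sketch of the LGS-first ordering, in Python, to make the recovery 
argument concrete. Everything here is an assumption for illustration: the 
Replica class, modelling the LGS as a list of (generation, start_offset) 
entries, and the crash_after_lgs flag that simulates a crash are all 
hypothetical and are not Kafka code.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Replica:
    # Hypothetical in-memory model, not Kafka's implementation.
    # log: one (generation, payload) record per offset.
    # lgs: (generation, start_offset) entries, one per leader generation.
    log: List[Tuple[int, str]] = field(default_factory=list)
    lgs: List[Tuple[int, int]] = field(default_factory=list)

    def log_end_offset(self) -> int:
        return len(self.log)

    def append(self, generation: int, payload: str,
               crash_after_lgs: bool = False) -> None:
        # LGS-first: record the generation boundary before writing the message.
        if not self.lgs or self.lgs[-1][0] != generation:
            self.lgs.append((generation, self.log_end_offset()))
            if crash_after_lgs:
                return  # simulated crash: LGS updated, log write never happened
        self.log.append((generation, payload))

    def recover(self) -> None:
        # An entry whose start offset is not in the log is known to be invalid.
        leo = self.log_end_offset()
        self.lgs = [(gen, start) for gen, start in self.lgs if start < leo]


replica = Replica()
replica.append(1, "a")
replica.append(1, "b")
replica.append(2, "c", crash_after_lgs=True)  # crash while crossing a boundary
dangling = list(replica.lgs)                  # last entry points past the log
replica.recover()
```

With this ordering the dangling entry can be detected from the log end offset 
alone. With the opposite ordering (log first, LGS second) a crash would leave 
a message in the log with no LGS entry, which cannot be detected this way.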



> Hold the produce request with ack > 1 in purgatory until replicas' HW has 
> larger than the produce offset
> --------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-1211
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1211
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Guozhang Wang
>            Assignee: Guozhang Wang
>             Fix For: 0.11.0.0
>
>
> Today during leader failover we have a window of weakness while the 
> followers truncate their data before fetching from the new leader, i.e., the 
> number of in-sync replicas is just 1. If the leader also fails during this 
> time, produce requests with ack > 1 that have already been responded to will 
> still be lost. To avoid this scenario we would prefer to hold the produce 
> request in purgatory until the replicas' HW is larger than the offset, 
> instead of just their end-of-log offsets.
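
A minimal sketch of the completion check the description proposes, in Python. 
The function name, the replica_hw mapping and the way required_acks is counted 
are assumptions for illustration, not Kafka's purgatory API.

```python
from typing import Dict


def can_complete_produce(produce_offset: int,
                         replica_hw: Dict[str, int],
                         required_acks: int) -> bool:
    # Hypothetical check: count replicas whose high watermark (HW) is larger
    # than the produce offset, rather than comparing their log end offsets.
    acked = sum(1 for hw in replica_hw.values() if hw > produce_offset)
    return acked >= required_acks


# The request stays in purgatory while this returns False.
still_waiting = not can_complete_produce(10, {"r1": 11, "r2": 10}, 2)
```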



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
