[
https://issues.apache.org/jira/browse/KAFKA-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Swapnil Ghike updated KAFKA-763:
--------------------------------
Attachment: kafka-763-new-v1.patch
Copy pasting the comments from patch new-v1:
1. The leader's log could be partially overlapping with the follower's log. In
that situation the follower gets an OffsetOutOfRangeException only when its end
offset is ahead of the leader's end offset. This can happen after an unclean
leader election:
A follower goes down while the leader keeps appending messages. The follower
comes back up, and before it has completely caught up with the leader's log,
all replicas in the ISR go down. The follower is now uncleanly elected as the
new leader, and it appends messages. The old leader comes back up, becomes a
follower, and may find that the current leader's end offset falls between its
own start offset and its own end offset.
In such a case, truncate the follower's log to the current leader's end offset
and continue fetching.
There is a potential for a mismatch between the logs of the two replicas here.
We don't fix this mismatch as of now.
2. Otherwise, the leader's log could be completely non-overlapping with the
follower's log:
i. The follower could have been down for a long time and when it starts up, its
end offset could be smaller than or equal to the leader's start offset because
the leader has deleted old logs (log.logEndOffset <= leaderStartOffset). OR
ii. Unclean leader election: A follower could be down for a long time. When it
starts up, all replicas in the ISR go down before the follower has the
opportunity to even start catching up with the leader's log. The follower is
now uncleanly elected as the new leader. The old leader comes back up, becomes
a follower, and may find that the current leader's end offset is smaller than
or equal to its own start offset (log.logStartOffset >= leaderEndOffset).
In both these cases, roll out a new log at the follower with the start offset
equal to the current leader's start offset and continue fetching.
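
The two cases above can be sketched as a small decision function. This is only
an illustration of the rules described in this comment, not the code in the
patch: the names Offsets, Action and handle_offset_out_of_range are
hypothetical, and "end" is assumed to mean the log end offset (the next offset
to be appended).

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    # Case 1: cut the follower's log back to the leader's end offset.
    TRUNCATE_TO_LEADER_END = "truncate"
    # Case 2: start a fresh log at the leader's start offset.
    ROLL_NEW_LOG_AT_LEADER_START = "roll"


@dataclass
class Offsets:
    start: int  # first offset still present in the log
    end: int    # log end offset (assumed: next offset to be appended)


def handle_offset_out_of_range(follower: Offsets, leader: Offsets):
    """Return (action, next_fetch_offset) for a follower whose fetch was
    rejected with an out-of-range error. Hypothetical helper."""
    # Case 2: the logs do not overlap at all.
    #   2.i  follower.end <= leader.start  (leader deleted old logs)
    #   2.ii follower.start >= leader.end  (unclean leader election)
    if follower.end <= leader.start or follower.start >= leader.end:
        return Action.ROLL_NEW_LOG_AT_LEADER_START, leader.start

    # Case 1: partial overlap with the follower's end ahead of the leader's.
    if leader.end < follower.end:
        return Action.TRUNCATE_TO_LEADER_END, leader.end

    # The follower's end offset lies inside the leader's range, so the
    # comment above gives no reason for an out-of-range error here.
    raise ValueError("follower end offset is within the leader's range")
```

For example, a follower holding offsets [0, 100) against a leader holding
[0, 60) falls under case 1 and truncates to 60, while a follower holding
[0, 50) against a leader holding [80, 200) falls under case 2.i and rolls a
new log starting at 80. As noted above, truncation in case 1 does not repair
any mismatch between the overlapping portions of the two logs.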
Other changes:
1. Fixed the error message for autoOffsetReset in ConsumerConfig.
2. Added a method logStartOffset in Log.
> Add an option to replica from the largest offset during unclean leader
> election
> -------------------------------------------------------------------------------
>
> Key: KAFKA-763
> URL: https://issues.apache.org/jira/browse/KAFKA-763
> Project: Kafka
> Issue Type: Improvement
> Components: core
> Affects Versions: 0.8
> Reporter: Jun Rao
> Assignee: Swapnil Ghike
> Priority: Blocker
> Labels: kafka-0.8, p2
> Attachments: kafka-763-new-v1.patch, kafka-763_v1.patch
>
>
> If there is an unclean leader election, a follower may have an offset out of
> the range of the leader. Currently, the follower will delete all its data and
> refetch from the smallest offset of the leader. It would be useful to add an
> option to let the follower refetch from the largest offset of the leader
> since refetching from the smallest offset may take some time.