[ https://issues.apache.org/jira/browse/KAFKA-6636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dong Lin resolved KAFKA-6636.
-----------------------------
Resolution: Fixed
This is no longer an issue after KAFKA-3978 ("Ensure high watermark is always
positive").
> ReplicaFetcherThread should not die if hw < 0
> ---------------------------------------------
>
> Key: KAFKA-6636
> URL: https://issues.apache.org/jira/browse/KAFKA-6636
> Project: Kafka
> Issue Type: Improvement
> Reporter: Dong Lin
> Assignee: Dong Lin
> Priority: Major
>
> ReplicaFetcherThread can die in the following scenario:
>
> 1) Partition P1 has replica set size 1. Broker A is the leader. The segment
> is empty and the log start offset is 100.
> 2) User executes partition reassignment to change the replica set from {A} to
> {B, C}.
> 3) Broker B starts ReplicaFetcherThread, which triggers
> handleOffsetOutOfRange(), fully truncates the log, and starts at offset 100. At
> this moment its high watermark is still 0 (or -1). Same for broker C.
> 4) Broker B sends FetchRequest to A at offset 100, broker A immediately adds
> broker B to ISR set, and controller moves leadership to broker B.
> 5) Broker B handles LeaderAndIsrRequest to become leader. It calls
> `leaderReplica.convertHWToLocalOffsetMetadata()` to initialize its HW. Since
> its HW was smaller than logStartOffset=100, now its HW will be overridden to
> LogOffsetMetadata.UnknownOffsetMetadata, i.e. -1.
> 6) Broker C handles LeaderAndIsrRequest to fetch from broker B. Broker C
> updates its HW to the HW returned in the FetchResponse, i.e. -1. Then broker C calls
> replica.maybeIncrementLogStartOffset(leaderLogStartOffset) where
> leaderLogStartOffset=100. This causes an exception because leaderLogStartOffset >
> HW. This is an unhandled exception and thus the ReplicaFetcherThread will exit.
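
Steps 5 and 6 of the scenario above can be sketched with a toy model. This is
not Kafka's code: the class ReplicaHwSketch, its fields, and the use of
IllegalStateException are hypothetical stand-ins, and the two methods only
mimic the behavior the description attributes to
convertHWToLocalOffsetMetadata() and maybeIncrementLogStartOffset().

```java
// Toy model of steps 5 and 6. Hypothetical names; not Kafka's actual classes.
final class ReplicaHwSketch {
    // Stand-in for LogOffsetMetadata.UnknownOffsetMetadata (offset -1).
    static final long UNKNOWN_OFFSET = -1L;

    long highWatermark;
    long logStartOffset;

    ReplicaHwSketch(long highWatermark, long logStartOffset) {
        this.highWatermark = highWatermark;
        this.logStartOffset = logStartOffset;
    }

    // Step 5 (as described in the issue): an HW below the log start offset
    // is replaced by the unknown offset when the replica becomes leader.
    void convertHwToLocalOffset() {
        if (highWatermark < logStartOffset) {
            highWatermark = UNKNOWN_OFFSET;
        }
    }

    // Step 6 (as described in the issue): the follower refuses to move its
    // log start offset past its HW. IllegalStateException is a stand-in for
    // whatever exception the real code throws.
    void maybeIncrementLogStartOffset(long newLogStartOffset) {
        if (newLogStartOffset > highWatermark) {
            throw new IllegalStateException("log start offset " + newLogStartOffset
                    + " > high watermark " + highWatermark);
        }
        if (newLogStartOffset > logStartOffset) {
            logStartOffset = newLogStartOffset;
        }
    }

    public static void main(String[] args) {
        // Broker B: truncated to offset 100, HW still 0, then becomes leader.
        ReplicaHwSketch brokerB = new ReplicaHwSketch(0L, 100L);
        brokerB.convertHwToLocalOffset();

        // Broker C: adopts the leader's HW, then applies leaderLogStartOffset.
        ReplicaHwSketch brokerC = new ReplicaHwSketch(brokerB.highWatermark, 100L);
        try {
            brokerC.maybeIncrementLogStartOffset(100L);
            System.out.println("fetch handled");
        } catch (IllegalStateException e) {
            // In the real broker this is unhandled, so the fetcher thread exits.
            System.out.println("fetcher would die: " + e.getMessage());
        }
    }
}
```

In this model the failure disappears as soon as the leader's HW is never
allowed to fall below its log start offset, which is consistent with the
resolution comment that a non-negative HW makes the scenario unreachable.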