Meyer Kizner created KAFKA-4414:
-----------------------------------
Summary: Unexpected "Halting because log truncation is not allowed"
Key: KAFKA-4414
URL: https://issues.apache.org/jira/browse/KAFKA-4414
Project: Kafka
Issue Type: Bug
Affects Versions: 0.9.0.1
Reporter: Meyer Kizner
Our Kafka installation runs with unclean leader election disabled, so brokers
halt when they find that their message offset is ahead of the leader's offset
for a topic. We had two brokers halt today with this issue. After much time
spent digging through the logs, I believe the following timeline is a plausible
account of what happened:
* B1, B2, and B3 are replicas of a topic, all in the ISR. B2 is currently the
leader, but B1 is the preferred leader. The controller runs on B3.
* B1 fails, but the controller does not detect the failure immediately.
* B2 receives a message from a producer and B3 fetches it to stay up to date.
B2 has not yet committed this pending message (the high water mark has not
advanced past it), because B1 is down and so has not acknowledged it.
* The controller triggers a preferred leader election, making B1 the leader,
and notifies all replicas.
* Very shortly afterwards (~200ms), B1's broker registration in ZooKeeper
expires, so the controller reassigns B2 to be leader again and notifies all
replicas.
* Because B3 is the controller, while B2 is on another box, B3 hears about both
of these events before B2 hears about either. B3 truncates its log to the high
water mark (before the pending message) and resumes fetching from B2.
* B3 fetches the pending message from B2 again.
* B2 learns that it has been displaced and then reelected, and truncates its
log to the high water mark, before the pending message.
* The next time B3 tries to fetch from B2, it sees that B2 is missing the
pending message and halts.
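To make the timeline above easier to check, here is a toy Python model of it.
This is only a sketch under simplifying assumptions, not Kafka's code: the
names Replica, HaltError, and replay_timeline are hypothetical, a log is a
plain list, the high water mark is a list index, and the halt text is borrowed
from this issue's summary.

```python
class HaltError(RuntimeError):
    """Raised where a real broker would halt instead of truncating."""


class Replica:
    """Hypothetical stand-in for one broker's copy of a partition log."""

    def __init__(self, name, log):
        self.name = name
        self.log = list(log)
        # Every message in the initial log is treated as committed.
        self.high_watermark = len(self.log)

    @property
    def log_end_offset(self):
        return len(self.log)

    def append(self, message):
        self.log.append(message)

    def truncate_to_high_watermark(self):
        # Drop everything that was never committed.
        del self.log[self.high_watermark:]

    def fetch_from(self, leader):
        # Models unclean leader election being disabled: a follower that
        # is ahead of its leader halts rather than discarding messages.
        if self.log_end_offset > leader.log_end_offset:
            raise HaltError(
                f"{self.name}: Halting because log truncation is not allowed "
                f"(own end offset {self.log_end_offset}, "
                f"leader end offset {leader.log_end_offset})"
            )
        self.log.extend(leader.log[self.log_end_offset:])


def replay_timeline():
    """Replay the hypothesized sequence; return the halt message, if any."""
    committed = ["m0", "m1"]
    b2 = Replica("B2", committed)  # current leader
    b3 = Replica("B3", committed)  # follower that also hosts the controller
    # B1 has failed and never acknowledges, so no high water mark advances.

    b2.append("pending")  # the producer writes to leader B2
    b3.fetch_from(b2)     # B3 replicates the uncommitted message

    # B3 learns of both leader changes before B2 learns of either.
    b3.truncate_to_high_watermark()
    b3.fetch_from(b2)     # B3 fetches the pending message again

    # B2 now processes the same two changes and truncates as well.
    b2.truncate_to_high_watermark()

    try:
        b3.fetch_from(b2)  # B3 is now one message ahead of its leader
    except HaltError as err:
        return str(err)
    return None
```

Calling replay_timeline() returns the halt message for B3, with B3's end offset
at 3 and B2's at 2. In this model the halt depends only on B3 applying both
leader changes before B2 applies either one.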
In this case, there was no data loss or inconsistency. I haven't fully thought
through whether either would be possible, but it seems likely that they would
be, especially if there had been multiple producers to this topic.
I'm not completely certain about this timeline, but this sequence of events
appears to at least be possible. Looking a bit through the controller code,
there doesn't seem to be anything that forces {{LeaderAndIsrRequest}}s to be
sent in a particular order. If someone with more knowledge of the code base
believes this is incorrect, I'd be happy to post the logs and/or do some more
digging.