[
https://issues.apache.org/jira/browse/KAFKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Evan Huus updated KAFKA-2143:
-----------------------------
Description:
On a cluster of 6 nodes, we recently saw a case where a single under-replicated
partition suddenly appeared, replication lag spiked, and network IO spiked. The
cluster appeared to recover eventually on its own.
Looking at the logs, the partition that failed was partition 7 of the topic
{{background_queue}}. It had an ISR of 1,4,3, and its leader at the time was broker 3.
Here are the interesting log lines:
On node 3 (the leader):
{noformat}
[2015-04-23 16:50:05,879] ERROR [Replica Manager on Broker 3]: Error when
processing fetch request for partition [background_queue,7] offset 3722949957
from follower with correlation id 148185816. Possible cause: Request for offset
3722949957 but we only have log segments in the range 3648049863 to 3722949955.
(kafka.server.ReplicaManager)
[2015-04-23 16:50:05,879] ERROR [Replica Manager on Broker 3]: Error when
processing fetch request for partition [background_queue,7] offset 3722949957
from follower with correlation id 156007054. Possible cause: Request for offset
3722949957 but we only have log segments in the range 3648049863 to 3722949955.
(kafka.server.ReplicaManager)
[2015-04-23 16:50:13,960] INFO Partition [background_queue,7] on broker 3:
Shrinking ISR for partition [background_queue,7] from 1,4,3 to 3
(kafka.cluster.Partition)
{noformat}
Note that both replicas suddenly asked for an offset *ahead* of the available
offsets.
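To make the error above concrete: the leader only serves fetch offsets inside the range it reports ("log segments in the range 3648049863 to 3722949955"), so a fetch for 3722949957 falls just past the end and is rejected. Below is a minimal, illustrative sketch of that kind of range check; it is not the actual {{kafka.server.ReplicaManager}} code, and all names in it are made up.
{noformat}
// Illustrative only -- a hypothetical model of the range check, not the
// 0.8.2.1 ReplicaManager implementation.
object FetchRangeCheckSketch {
  final case class OffsetOutOfRangeException(message: String)
      extends RuntimeException(message)

  // Reject any fetch offset outside the range the leader actually has.
  def checkFetchOffset(fetchOffset: Long,
                       logStartOffset: Long,
                       logEndOffset: Long): Unit =
    if (fetchOffset < logStartOffset || fetchOffset > logEndOffset)
      throw OffsetOutOfRangeException(
        s"Request for offset $fetchOffset but we only have log segments " +
          s"in the range $logStartOffset to $logEndOffset.")

  def main(args: Array[String]): Unit =
    // Same numbers as the log lines above: the follower's offset is two past
    // the leader's log end offset, so the check fails.
    try checkFetchOffset(3722949957L, 3648049863L, 3722949955L)
    catch {
      case e: OffsetOutOfRangeException => println("ERROR " + e.getMessage)
    }
}
{noformat}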
And on nodes 1 and 4 (the replicas) many occurrences of the following:
{noformat}
[2015-04-23 16:50:05,935] INFO Scheduling log segment 3648049863 for log
background_queue-7 for deletion. (kafka.log.Log)
{noformat}
Based on my reading, this looks like the replicas somehow got *ahead* of the
leader, asked for an invalid offset, got confused, and re-replicated the entire
topic from scratch to recover. This matches our network graphs, which show
broker 3 sending a large amount of data to brokers 1 and 4.
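To make that hypothesis concrete, here is a hedged sketch of the two obvious ways a follower could recover once the leader rejects its fetch offset as out of range: truncate back to the leader's log end offset and continue, or discard its local segments and re-fetch everything from the leader's earliest offset. The second path is the "re-replicate the entire topic from scratch" behaviour described above. This is a simplified, hypothetical model, not the actual follower fetcher logic in 0.8.2.1, and every name in it is illustrative.
{noformat}
// Hypothetical, simplified model of follower-side recovery after an
// out-of-range fetch -- illustrative only, not the real fetcher code.
object FollowerOutOfRangeSketch {
  final case class LeaderLogInfo(logStartOffset: Long, logEndOffset: Long)

  /** Decide where the follower should resume fetching after the leader
    * rejected `fetchOffset` as out of range.  Returns the resume offset and
    * whether the local segments were discarded to get there. */
  def resumeAfterOutOfRange(fetchOffset: Long,
                            leader: LeaderLogInfo): (Long, Boolean) =
    if (fetchOffset > leader.logEndOffset)
      // Follower is ahead of the leader: truncate back to the leader's log
      // end offset and keep the rest of the local log.
      (leader.logEndOffset, false)
    else
      // Follower is behind the leader's earliest retained offset: discard the
      // local segments (cf. "Scheduling log segment ... for deletion") and
      // re-replicate from the leader's start -- a full copy of the partition.
      (leader.logStartOffset, true)

  def main(args: Array[String]): Unit = {
    // Numbers from the incident: fetch offset 3722949957 against a leader
    // whose log covers 3648049863 to 3722949955.
    val leader = LeaderLogInfo(3648049863L, 3722949955L)
    val (resumeAt, discardedLog) = resumeAfterOutOfRange(3722949957L, leader)
    println(s"resume fetching at $resumeAt, local log discarded: $discardedLog")
  }
}
{noformat}
If the followers ended up on the second branch even though their fetch offset was ahead of the leader rather than behind it, that would line up with both the segment-deletion lines and the network graphs.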
Taking a stab in the dark at the cause: could there be a race condition where
the replicas receive a new offset before the leader has committed it and is
ready to replicate it?
was:
On a cluster of 6 nodes, we recently saw a case where a single under-replicated
partition suddenly appeared, replication lag spiked, and network IO spiked. The
cluster appeared to recover eventually on its own.
Looking at the logs, the partition that failed was partition 7 of the topic
{{background_queue}}. It had an ISR of 1,4,3, and its leader at the time was broker 3.
Here are the interesting log lines:
On node 3 (the leader):
{noformat}
[2015-04-23 16:50:05,879] ERROR [Replica Manager on Broker 3]: Error when
processing fetch request for partition [background_queue,7] offset 3722949957
from follower with correlation id 148185816. Possible cause: Request for offset
3722949957 but we only have log segments in the range 3648049863 to 3722949955.
(kafka.server.ReplicaManager)
[2015-04-23 16:50:05,879] ERROR [Replica Manager on Broker 3]: Error when
processing fetch request for partition [background_queue,7] offset 3722949957
from follower with correlation id 156007054. Possible cause: Request for offset
3722949957 but we only have log segments in the range 3648049863 to 3722949955.
(kafka.server.ReplicaManager)
[2015-04-23 16:50:13,960] INFO Partition [background_queue,7] on broker 3:
Shrinking ISR for partition [background_queue,7] from 1,4,3 to 3
(kafka.cluster.Partition)
{noformat}
And on nodes 1 and 4 (the replicas) many occurrences of the following:
{noformat}
[2015-04-23 16:50:05,935] INFO Scheduling log segment 3648049863 for log
background_queue-7 for deletion. (kafka.log.Log)
{noformat}
Based on my reading, this looks like the replicas somehow got *ahead* of the
leader, asked for an invalid offset, got confused, and re-replicated the entire
topic from scratch to recover. This matches our network graphs, which show
broker 3 sending a large amount of data to brokers 1 and 4.
Taking a stab in the dark at the cause: could there be a race condition where
the replicas receive a new offset before the leader has committed it and is
ready to replicate it?
> Replicas get ahead of leader and fail
> -------------------------------------
>
> Key: KAFKA-2143
> URL: https://issues.apache.org/jira/browse/KAFKA-2143
> Project: Kafka
> Issue Type: Bug
> Components: replication
> Affects Versions: 0.8.2.1
> Reporter: Evan Huus
> Assignee: Neha Narkhede
>
> On a cluster of 6 nodes, we recently saw a case where a single
> under-replicated partition suddenly appeared, replication lag spiked, and
> network IO spiked. The cluster appeared to recover eventually on its own.
> Looking at the logs, the partition that failed was partition 7 of the topic
> {{background_queue}}. It had an ISR of 1,4,3, and its leader at the time was
> broker 3. Here are the interesting log lines:
> On node 3 (the leader):
> {noformat}
> [2015-04-23 16:50:05,879] ERROR [Replica Manager on Broker 3]: Error when
> processing fetch request for partition [background_queue,7] offset 3722949957
> from follower with correlation id 148185816. Possible cause: Request for
> offset 3722949957 but we only have log segments in the range 3648049863 to
> 3722949955. (kafka.server.ReplicaManager)
> [2015-04-23 16:50:05,879] ERROR [Replica Manager on Broker 3]: Error when
> processing fetch request for partition [background_queue,7] offset 3722949957
> from follower with correlation id 156007054. Possible cause: Request for
> offset 3722949957 but we only have log segments in the range 3648049863 to
> 3722949955. (kafka.server.ReplicaManager)
> [2015-04-23 16:50:13,960] INFO Partition [background_queue,7] on broker 3:
> Shrinking ISR for partition [background_queue,7] from 1,4,3 to 3
> (kafka.cluster.Partition)
> {noformat}
> Note that both replicas suddenly asked for an offset *ahead* of the available
> offsets.
> And on nodes 1 and 4 (the replicas) many occurrences of the following:
> {noformat}
> [2015-04-23 16:50:05,935] INFO Scheduling log segment 3648049863 for log
> background_queue-7 for deletion. (kafka.log.Log)
> {noformat}
> Based on my reading, this looks like the replicas somehow got *ahead* of the
> leader, asked for an invalid offset, got confused, and re-replicated the
> entire topic from scratch to recover. This matches our network graphs, which
> show broker 3 sending a large amount of data to brokers 1 and 4.
> Taking a stab in the dark at the cause: could there be a race condition where
> the replicas receive a new offset before the leader has committed it and is
> ready to replicate it?
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)