[jira] [Commented] (KAFKA-2143) Replicas get ahead of leader and fail

2017-02-09 Thread Mathieu Filotto (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15860712#comment-15860712
 ] 

Mathieu Filotto commented on KAFKA-2143:


Is upgrading the only option?
Does upgrading to 0.10 work as well?

> Replicas get ahead of leader and fail
> -
>
> Key: KAFKA-2143
> URL: https://issues.apache.org/jira/browse/KAFKA-2143
> Project: Kafka
>  Issue Type: Bug
>  Components: replication
>Affects Versions: 0.8.2.1
>Reporter: Evan Huus
>Assignee: Jiangjie Qin
> Fix For: 0.9.0.1
>
>
> On a cluster of 6 nodes, we recently saw a case where a single
> under-replicated partition suddenly appeared, replication lag spiked, and
> network IO spiked. The cluster appeared to recover eventually on its own.
> Looking at the logs, the thing which failed was partition 7 of the topic 
> {{background_queue}}. It had an ISR of 1,4,3 and its leader at the time was 
> 3. Here are the interesting log lines:
> On node 3 (the leader):
> {noformat}
> [2015-04-23 16:50:05,879] ERROR [Replica Manager on Broker 3]: Error when 
> processing fetch request for partition [background_queue,7] offset 3722949957 
> from follower with correlation id 148185816. Possible cause: Request for 
> offset 3722949957 but we only have log segments in the range 3648049863 to 
> 3722949955. (kafka.server.ReplicaManager)
> [2015-04-23 16:50:05,879] ERROR [Replica Manager on Broker 3]: Error when 
> processing fetch request for partition [background_queue,7] offset 3722949957 
> from follower with correlation id 156007054. Possible cause: Request for 
> offset 3722949957 but we only have log segments in the range 3648049863 to 
> 3722949955. (kafka.server.ReplicaManager)
> [2015-04-23 16:50:13,960] INFO Partition [background_queue,7] on broker 3: 
> Shrinking ISR for partition [background_queue,7] from 1,4,3 to 3 
> (kafka.cluster.Partition)
> {noformat}
> Note that both replicas suddenly asked for an offset *ahead* of the available 
> offsets.
> And on nodes 1 and 4 (the replicas) many occurrences of the following:
> {noformat}
> [2015-04-23 16:50:05,935] INFO Scheduling log segment 3648049863 for log 
> background_queue-7 for deletion. (kafka.log.Log) (edited)
> {noformat}
> Based on my reading, this looks like the replicas somehow got *ahead* of the 
> leader, asked for an invalid offset, got confused, and re-replicated the 
> entire topic from scratch to recover (this matches our network graphs, which 
> show 3 sending a bunch of data to 1 and 4).
> Taking a stab in the dark at the cause, there appears to be a race condition 
> where replicas can receive a new offset before the leader has committed it 
> and is ready to replicate?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-2143) Replicas get ahead of leader and fail

2016-04-14 Thread Jason Rosenberg (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242428#comment-15242428
 ] 

Jason Rosenberg commented on KAFKA-2143:


After looking at KAFKA-2477, I think that is the actual issue reported in this 
ticket (as opposed to the unrelated issue the first commenter raised), and I 
see that KAFKA-2477 seems to have been fixed.  So upgrading to 0.9.X is the 
thing to do, it would appear.



[jira] [Commented] (KAFKA-2143) Replicas get ahead of leader and fail

2016-04-14 Thread Jason Rosenberg (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242412#comment-15242412
 ] 

Jason Rosenberg commented on KAFKA-2143:


Hello, I recently experienced this issue, as described in the original 
description above.  We are running 0.8.2.2.
Note, in the original description above (and in my case as well), there was no 
evidence of a network glitch, and no evidence of any partition leadership 
change.  So, I'm concerned whether the discussion that follows in this ticket 
(which describes network glitches and leadership changes almost exclusively) 
really addressed the original issue as reported (and my recent observation of 
this as well).

Can someone comment on this please?  [~becket_qin]



RE: [jira] [Commented] (KAFKA-2143) Replicas get ahead of leader and fail

2016-02-02 Thread chenlax
I am hitting the same issue. The error log:

Error when processing fetch request for partition [To_S3_comm_V0_10,2]
offset 456234794 from follower with correlation id 254117341. Possible cause:
Request for offset 456234794 but we only have log segments in the range
432322850 to 456234793. (kafka.server.ReplicaManager)

And I find that handleOffsetOutOfRange only checks (leaderEndOffset <
replica.logEndOffset.messageOffset); if that is not the case, it will delete the whole local log:

INFO Scheduling log segment 432322850 for log To_S3_comm_V0_10-2 for deletion. 
(kafka.log.Log)
...
INFO Deleting segment 434379909 from log To_S3_comm_V0_10-2. (kafka.log.Log)

I think it should also check (log.logEndOffset < leaderStartOffset) when
leaderEndOffset is not smaller than replica.logEndOffset.messageOffset, and only delete the local log in that case.
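
A minimal standalone sketch of the suggested guard (illustration only, with assumed parameter names; not the actual ReplicaFetcherThread code):

{noformat}
// Illustrative sketch of the extra guard suggested above: only wipe the local
// log when the follower is truly behind the leader's earliest retained offset.
object OffsetOutOfRangeGuardSketch {
  def newFetchOffset(leaderStartOffset: Long,
                     leaderEndOffset: Long,
                     followerEndOffset: Long): Long = {
    if (leaderEndOffset < followerEndOffset) {
      // Follower is ahead of the leader: truncate back to the leader's end offset.
      leaderEndOffset
    } else if (followerEndOffset < leaderStartOffset) {
      // Follower is behind the leader's earliest retained offset:
      // only now is it necessary to drop the whole local log.
      leaderStartOffset
    } else {
      // Proposed extra case: the follower's offset is still inside the leader's
      // log range, so keep the local log and resume fetching from here.
      followerEndOffset
    }
  }

  def main(args: Array[String]): Unit =
    // Offsets from the log above: follower at 456234794, leader has 432322850..456234793.
    println(newFetchOffset(432322850L, 456234793L, 456234794L)) // prints 456234793
}
{noformat}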


Thanks,
Lax


> Date: Fri, 4 Sep 2015 00:41:47 +
> From: j...@apache.org
> To: dev@kafka.apache.org
> Subject: [jira] [Commented] (KAFKA-2143) Replicas get ahead of leader and fail

[jira] [Commented] (KAFKA-2143) Replicas get ahead of leader and fail

2016-01-29 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15123866#comment-15123866
 ] 

Jiangjie Qin commented on KAFKA-2143:
-

Thanks, [~guozhang]. I almost forgot I had this ticket :)



[jira] [Commented] (KAFKA-2143) Replicas get ahead of leader and fail

2016-01-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15122919#comment-15122919
 ] 

ASF GitHub Bot commented on KAFKA-2143:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/129




[jira] [Commented] (KAFKA-2143) Replicas get ahead of leader and fail

2015-12-14 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056999#comment-15056999
 ] 

Jiangjie Qin commented on KAFKA-2143:
-

[~ottomata], KAFKA-2477 is a more likely cause. You may want to take a look at 
that.



[jira] [Commented] (KAFKA-2143) Replicas get ahead of leader and fail

2015-12-14 Thread Andrew Otto (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056672#comment-15056672
 ] 

Andrew Otto commented on KAFKA-2143:


I'd like to add that Wikimedia is experiencing this as well. It only started
happening to us recently, when we started producing to Kafka via a PHP process
serving a web request. This process is short-lived, so it only connects to Kafka
to produce a single message and then disconnects. The only topic this happens to
is the one that this PHP process produces to.

Everything recovers as it should, but something is definitely wrong here.  Why 
is a follower trying to consume an offset ahead of what the leader has?

We're running 0.8.2.1 (with the snappy compression bugfix backported).



[jira] [Commented] (KAFKA-2143) Replicas get ahead of leader and fail

2015-09-03 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14730117#comment-14730117
 ] 

Jiangjie Qin commented on KAFKA-2143:
-

[~junrao], you are right. I realized there should be an unclean leader election in
this case. I'll check the code further to see if there is anything relevant to
KAFKA-2477.



[jira] [Commented] (KAFKA-2143) Replicas get ahead of leader and fail

2015-09-03 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14730109#comment-14730109
 ] 

Jun Rao commented on KAFKA-2143:


[~becket_qin], since before step (3) both A and B are in the ISR, the last
committed offset in A can't be larger than 3000. So, in step (3), if A becomes
a follower, it has to first truncate its log to its last committed offset
before fetching. So, at that point, A's fetch offset can't be larger than 3000
and therefore won't be out of range.

The following is an alternative scenario that can cause this.

1) Broker A (leader) receives messages to 5000
2) Broker B (follower) receives messages to 3000 (it is still in ISR because of 
replica.lag.max.messages)
3) For some reason, B is dropped out of ISR.
4) Broker A (the only one in ISR) commits messages to 5000.
5) For some reason, Broker A is considered dead and Broker B is live.
6) Broker B is selected as the new leader (unclean leader election) and is the 
only one in ISR.
7) Broker A is considered live again and starts fetching from 5000 (last 
committed offset) and gets OffsetOutOfRangeException.
8) In the meantime, B receives more messages up to offset 6000.
9) Broker A tries to handle OffsetOutOfRangeException and finds out leader B's 
log end offset is now larger than its log end offset and truncates all its log.

Your patch reduces the amount of data that Broker A needs to replicate in
step 9, which is probably fine. However, we should probably first verify that
this is indeed what's happening, since it seems it should happen only rarely.
Also, KAFKA-2477 reports a similar issue w/o any leadership change, so maybe
there is something else that can cause this.
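
For illustration, a tiny walk-through of the steps above (assumed names, deliberately simplified; not broker code), tracking the ISR and each broker's log end offset to show why A's first fetch comes back out of range:

{noformat}
// Illustrative simulation of steps 1-7 above; not actual broker code.
object UncleanElectionStepsSketch {
  def main(args: Array[String]): Unit = {
    var isr = Set("A", "B")
    val leoA = 5000L            // step 1: A (leader) receives messages to 5000
    val leoB = 3000L            // step 2: B (follower) lags at 3000 but is still in the ISR
    isr -= "B"                  // step 3: B drops out of the ISR
    val committed = leoA        // step 4: A, alone in the ISR, commits up to 5000
    isr = Set("B")              // steps 5,6: A considered dead, B elected leader (unclean), sole ISR member
    // step 7: A comes back and fetches from its last committed offset (5000),
    // which is beyond B's log end offset (3000), hence OffsetOutOfRangeException.
    println(s"ISR = $isr; A fetches from $committed, leader B ends at $leoB, " +
            s"out of range: ${committed > leoB}")
  }
}
{noformat}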



[jira] [Commented] (KAFKA-2143) Replicas get ahead of leader and fail

2015-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14680467#comment-14680467
 ] 

ASF GitHub Bot commented on KAFKA-2143:
---

GitHub user becketqin opened a pull request:

https://github.com/apache/kafka/pull/129

KAFKA-2143: fix replica offset truncate to beginning during leader 
migration.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/becketqin/kafka KAFKA-2143

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/129.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #129


commit 71f8a4716e1f0b4fc2bd88aa30fe38aef8a9f92e
Author: Jiangjie Qin becket@gmail.com
Date:   2015-08-03T02:22:02Z

Fix for KAFKA-2134, fix replica offset truncate to beginning during leader 
migration.






[jira] [Commented] (KAFKA-2143) Replicas get ahead of leader and fail

2015-08-06 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660629#comment-14660629
 ] 

Ewen Cheslack-Postava commented on KAFKA-2143:
--

[~wushujames] I think one easy way for a broker to be in the ISR for one topic and
out of the ISR for another is a network partition: if topic A has broker 1 as
leader and topic B has broker 2 as leader, then broker 3, replicating both A and
B, could stay in the ISR for A while being unable to replicate data for B if a
network partition cuts broker 3 off from topic B's leader but not from topic A's
leader.



[jira] [Commented] (KAFKA-2143) Replicas get ahead of leader and fail

2015-08-05 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659477#comment-14659477
 ] 

Jiangjie Qin commented on KAFKA-2143:
-

In that case, yes, messages 3000-5000 will be lost. If the producer is not
producing with acks=-1, that is the expected behavior. If acks=-1, then as long
as the producer gets the callback from the broker, the message has been appended
to all the brokers in the ISR (NOT all the replicas!).

Here is a slide describing how to avoid data loss if you are interested:
http://www.slideshare.net/JiangjieQin/no-data-loss-pipeline-with-apache-kafka-49753844
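
As a concrete illustration of the acks setting discussed above, a minimal producer sketch (broker address and topic name are made up; "all" is equivalent to -1 in the new producer):

{noformat}
// Minimal sketch: with acks set to "all" (same as -1), a successful send
// callback means every broker currently in the ISR has the message.
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object AcksAllProducerSketch {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092") // assumed broker address
    props.put("acks", "all")                       // wait for the full ISR, not just the leader
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

    val producer = new KafkaProducer[String, String](props)
    // Blocking on the returned future surfaces any failure to reach the ISR.
    producer.send(new ProducerRecord[String, String]("background_queue", "key", "value")).get()
    producer.close()
  }
}
{noformat}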



[jira] [Commented] (KAFKA-2143) Replicas get ahead of leader and fail

2015-08-05 Thread James Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659503#comment-14659503
 ] 

James Cheng commented on KAFKA-2143:


Thanks [~becket_qin]. [~ewencp] sent me that slidedeck of yours. I added all 
your suggested settings to my team's Kafka recommended settings wiki. :)

Even with acks=-1, I think it's still possible for a replica to lag behind the
leader by 2000 messages, right? It would be lagging behind by 2000 messages
that were produced but not yet acknowledged. Not sure if losing those would
count as message loss, since they weren't actually ack'd?

replica.lag.max.messages is a broker-level config setting, not a topic-level
config setting. I think that means it's hard to find a setting that works for
both high-volume and low-volume topics, right?

And... a bit of Googling around, and I think my observation is the exact thing 
that inspired 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-16+-+Automated+Replica+Lag+Tuning
 and  
http://www.confluent.io/blog/hands-free-kafka-replication-a-lesson-in-operational-simplicity/,
 right?

Is it possible for broker to be in the ISR for one topic, and to be out of the 
ISR for another topic?



[jira] [Commented] (KAFKA-2143) Replicas get ahead of leader and fail

2015-08-05 Thread James Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659277#comment-14659277
 ] 

James Cheng commented on KAFKA-2143:


I'm trying to understand part of [~bugzmanov]'s description. If 
unclean.leader.election.enable=false, then in Step 5, as you say, Broker A is 
kicked out of the ISR list.

What happens, then, to the messages 3000-5000 that were originally in Broker
A? Are they just lost?

What were the producer settings for request.required.acks (old producer) or acks
(new producer) in this scenario? Was it 0 or 1? If it was -1 (old producer) or
all (new producer), is it possible for the replica lag to get that large?



[jira] [Commented] (KAFKA-2143) Replicas get ahead of leader and fail

2015-08-02 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651344#comment-14651344
 ] 

Jiangjie Qin commented on KAFKA-2143:
-

Actually I realized KAFKA-2334 is orthogonal to this issue. I submitted patch 
for this ticket and will leave KAFKA-2334 to be solved separately.



[jira] [Commented] (KAFKA-2143) Replicas get ahead of leader and fail

2015-07-31 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14650146#comment-14650146
 ] 

Jiangjie Qin commented on KAFKA-2143:
-

Hmm, with the current design, when the follower gets an OffsetOutOfRangeException
during a fetch and *the leader's log end offset is smaller than the follower's
log end offset*, the follower will not truncate to zero but to the current
leader's log end offset. That means in step (5), broker A should start to fetch
from offset 3000 instead of dropping to 0, unless broker B really has nothing.

However, if the follower's log end offset is smaller than the leader's log end
offset but we received an offset out of range exception, we assume it was because
the follower's log end offset is too small and behind the earliest available
offset of the leader. That is not necessarily true, because we fetch the leader's
offsets again only after we receive the exception. So it is possible that when
the exception was returned the follower was ahead of the leader, but by the time
it tries to handle the exception the new leader has had some new messages
appended, so the follower is now behind the leader. In that case, the follower
will truncate its log to the log start offset of the leader, which is the problem
you described.

I think what happened is as below:

1) Broker A (leader) has committed offsets up to 5000
2) Broker B (follower) has committed offsets up to 3000 (it is still in the ISR
because of replica.lag.max.messages)
***network glitch happens***
3) Broker B becomes the leader, Broker A becomes a follower
4) Broker A fetches from Broker B and gets an OffsetOutOfRangeException.
4.1) Broker B gets some new messages and its log end offset becomes 6000.
4.2) Broker A tries to handle the exception, checks the log end offset on broker
B, finds it is greater than broker A's log end offset, and so truncates itself to
Broker B's log start offset.

And the rest is pretty much what you described.

By design, at step (3) when Broker A becomes a follower, it will truncate its log
to its high watermark, which should be 3000. There is a possibly related issue,
KAFKA-2334, which might make 3000 not the real high watermark on Broker B. And
that issue triggered the offset out of range in step (4).

So it looks to me like KAFKA-2334 and the ReplicaFetcherThread bug together
caused the issue. I'll take both of the issues and see if they can be solved
together.
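
For illustration, a tiny sketch of that race (simplified model with assumed names, not the actual ReplicaFetcherThread code). The decision is made against the leader's end offset as read after the exception, so a leader that has moved forward in the meantime makes a follower that was actually ahead look like it is behind, and its whole log is dropped:

{noformat}
// Illustrative sketch only; the offsets match the scenario above.
object OffsetOutOfRangeRaceSketch {
  final case class LeaderLog(var startOffset: Long, var endOffset: Long)

  // Decision as described above: compare against the leader's *current* end offset.
  def newFetchOffset(followerEndOffset: Long, leader: LeaderLog): Long =
    if (leader.endOffset < followerEndOffset) leader.endOffset // follower ahead: truncate to leader end
    else leader.startOffset                                    // assumed "follower behind": drop whole log

  def main(args: Array[String]): Unit = {
    val leaderB = LeaderLog(0L, 3000L)   // B just became leader with messages up to 3000
    val followerA = 5000L                // A was ahead of B when the exception was raised
    leaderB.endOffset = 6000L            // ...but B appends new messages before A handles it
    // Leader end (6000) is now larger than the follower (5000), so A truncates to B's start offset.
    println(newFetchOffset(followerA, leaderB)) // prints 0
  }
}
{noformat}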


[jira] [Commented] (KAFKA-2143) Replicas get ahead of leader and fail

2015-07-30 Thread Rafael Bagmanov (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14648444#comment-14648444
 ] 

Rafael Bagmanov commented on KAFKA-2143:


I would like to add that we are experiencing this issue quite often, and the
situation gets worse when we have configs like this:

replication factor: 2
unclean.leader.election.enable: true

This leads to the loss of a whole partition in case of two network glitches
happening in a row.

The failure scenario looks like this: we have replica.lag.max.messages: 4000, so
I'm guessing the follower is always a little bit behind the leader for a
high-volume topic (which means the leader has a larger offset value than the
follower).

1) Broker A (leader) has committed offsets up to 5000
2) Broker B (follower) has committed offsets up to 3000 (it is still in the ISR
because of replica.lag.max.messages)
***network glitch happens***
3) Broker B becomes the leader, Broker A becomes a follower
4) Broker A (follower) asks the leader for messages starting from 5000
5) Broker A (follower) receives a response that this is an invalid offset
(Broker B only has up to 3000) and drops its partition to 0
***network glitch happens***
6) Broker A becomes the leader (unclean election), Broker B becomes a follower
7) Broker B (follower) asks the leader for messages starting from 3000
8) Broker B (follower) receives a response that this is an invalid offset
(Broker A now has only 0) and drops its partition to 0

As a result we lose the whole partition because of two network glitches.

And if the configs are

replication factor: 2
unclean.leader.election.enable: false (!)

the scenario repeats up to step 5, but then Broker A gets kicked out of the ISR
and no unclean election happens.
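
For illustration, a tiny simulation of the two-glitch sequence (assumed names, deliberately simplified; not broker code), showing both copies of the partition ending up empty when unclean leader election is enabled:

{noformat}
// Illustrative only: each "glitch" swaps the leader; an out-of-range follower
// drops its log to the new leader's start offset (0 here), as described above.
object DoubleGlitchSketch {
  final case class Log(var start: Long, var end: Long)

  // Follower reconciliation as described in steps 5 and 8.
  def reconcile(follower: Log, leader: Log): Unit =
    if (follower.end > leader.end) {   // follower asks for an offset the leader doesn't have
      follower.start = leader.start    // ...and drops its partition to the leader's start (0)
      follower.end = leader.start
    }

  def main(args: Array[String]): Unit = {
    val a = Log(0L, 5000L)  // Broker A, originally the leader
    val b = Log(0L, 3000L)  // Broker B, lagging follower still in the ISR

    reconcile(follower = a, leader = b)  // glitch 1: B leads, A drops to 0..0
    reconcile(follower = b, leader = a)  // glitch 2: A leads (unclean), B drops to 0..0

    println(s"A: ${a.start}..${a.end}, B: ${b.start}..${b.end}") // A: 0..0, B: 0..0
  }
}
{noformat}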
