We have a 6-broker cluster running in AWS across 3 availability zones. A few
times, while under light load (roughly 40k messages/second), we have seen a
replica request a message from the leader at an offset that is slightly
ahead of the end of the leader's log, usually by 3-6 messages. When this
happens the replica logs an error, deletes all of its data for that
partition, and resyncs from the leader's earliest offset. Given that the
offset difference is so small I suspect a latency/timing issue, but I am
unsure which settings to tweak. Thank you in advance for any assistance!

Leader logs:

[2015-04-15 02:07:21,328] ERROR [Replica Manager on Broker 2]: Error when
processing fetch request for partition [xxx.prod,1] offset 127413332 from
follower with correlation id 35310725. Possible cause: Request for offset
127413332 but we only have log segments in the range 429569 to 127413328.
(kafka.server.ReplicaManager)
[2015-04-15 02:07:23,593] INFO Partition [xxx.prod,1] on broker 2:
Shrinking ISR for partition [xxx.prod,1] from 2,6 to 2
(kafka.cluster.Partition)
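
To make the size of the gap concrete, here is a minimal sketch in Python that pulls the numbers out of the leader's error line above. The regular expression and the helper name `offset_gap` are my own, written against this one message format only; nothing here is a Kafka API.

```python
import re

# Relevant part of the leader's error line, re-joined onto one line.
LEADER_ERROR = (
    "Request for offset 127413332 but we only have log segments "
    "in the range 429569 to 127413328."
)

# Hypothetical pattern, matched against the message text shown above.
PATTERN = re.compile(
    r"Request for offset (\d+) but we only have log segments "
    r"in the range (\d+) to (\d+)"
)


def offset_gap(message):
    """Return how far the requested offset lies past the end of the leader's range."""
    match = PATTERN.search(message)
    if match is None:
        raise ValueError("not an out-of-range fetch error")
    requested, _start, end = (int(group) for group in match.groups())
    return requested - end


print(offset_gap(LEADER_ERROR))  # 4
```

In this instance the follower asked for an offset 4 past the end of the range the leader reported.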

Follower logs:
...
[2015-04-15 02:08:02,085] INFO Scheduling log segment 124662576 for log
xxx.prod-1 for deletion. (kafka.log.Log)
[2015-04-15 02:08:02,086] INFO Scheduling log segment 126360465 for log
xxx.prod-1 for deletion. (kafka.log.Log)
[2015-04-15 02:08:02,121] WARN [ReplicaFetcherThread-3-2], Replica 6 for
partition [xxx.prod,1] reset its fetch offset from 429569 to current leader
2's start offset 429569 (kafka.server.ReplicaFetcherThread)
[2015-04-15 02:08:02,131] ERROR [ReplicaFetcherThread-3-2], Current offset
127413332 for partition [xxx.prod,1] out of range; reset offset to 429569
(kafka.server.ReplicaFetcherThread)
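
Put as a toy model, the follower behaviour in these logs looks roughly like the sketch below. This only mirrors what the log lines show and is not Kafka's actual code; the function name, its parameters, and treating the upper bound of the reported range as still fetchable are all assumptions on my part.

```python
def next_fetch_offset(fetch_offset, leader_start, leader_end):
    """Toy model of the follower's reaction to an out-of-range fetch.

    Returns (new_fetch_offset, local_log_discarded).
    Assumption: any offset inside [leader_start, leader_end] is fetchable.
    """
    if leader_start <= fetch_offset <= leader_end:
        # Normal case: keep fetching from where we are.
        return fetch_offset, False
    # Out of range: as seen in the logs, the local segments are scheduled
    # for deletion and fetching restarts at the leader's start offset.
    return leader_start, True


# Values taken from the logs above.
print(next_fetch_offset(127413332, 429569, 127413328))  # (429569, True)
```

So being only 4 messages past the end costs the same as being completely out of range: a full resync from offset 429569.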

Relevant config:

num.network.threads=8
num.io.threads=8
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
default.replication.factor=2
num.replica.fetchers=4
replica.fetch.max.bytes=1048576
replica.fetch.wait.max.ms=3000
replica.high.watermark.checkpoint.interval.ms=5000
replica.socket.timeout.ms=30000
replica.socket.receive.buffer.bytes=65536
replica.lag.time.max.ms=10000
replica.lag.max.messages=4000
controller.socket.timeout.ms=30000
controller.message.queue.size=100000
