Dimitrij Denissenko created KAFKA-3251:
------------------------------------------
             Summary: Requesting committed offsets results in inconsistent results
                 Key: KAFKA-3251
                 URL: https://issues.apache.org/jira/browse/KAFKA-3251
             Project: Kafka
          Issue Type: Bug
          Components: offset manager
    Affects Versions: 0.9.0.0
            Reporter: Dimitrij Denissenko


Hi,

I am using github.com/Shopify/sarama to retrieve the committed offsets for a high-volume topic, but the bug seems to originate in Kafka itself. I have written a small test that queries the committed offsets of all partitions of one topic once every second. The request looks like this:

{code}
OffsetFetchRequest{
  ConsumerGroup: "my-group-name",
  Version: 1,
  TopicPartitions: []TopicPartition{
    {TopicName: "logs", Partitions: []int32{0,1,2,3,4,5,6,7}},
  },
}
{code}

Most of the time the responses are correct, but every 10 minutes or so there is a glitch. I am not familiar with the Kafka internals, but it looks like a race condition. Here's my log output:

{code}
...
2016/02/19 09:48:10 topic=logs partition=00 error=0 offset=206567925
2016/02/19 09:48:10 topic=logs partition=01 error=0 offset=206671019
2016/02/19 09:48:10 topic=logs partition=02 error=0 offset=206567995
2016/02/19 09:48:10 topic=logs partition=03 error=0 offset=205785315
2016/02/19 09:48:10 topic=logs partition=04 error=0 offset=206526677
2016/02/19 09:48:10 topic=logs partition=05 error=0 offset=206713764
2016/02/19 09:48:10 topic=logs partition=06 error=0 offset=206524006
2016/02/19 09:48:10 topic=logs partition=07 error=0 offset=206629121
2016/02/19 09:48:11 topic=logs partition=00 error=0 offset=206572870
2016/02/19 09:48:11 topic=logs partition=01 error=0 offset=206675966
2016/02/19 09:48:11 topic=logs partition=02 error=0 offset=206573267
2016/02/19 09:48:11 topic=logs partition=03 error=0 offset=205790613
2016/02/19 09:48:11 topic=logs partition=04 error=0 offset=206531841
2016/02/19 09:48:11 topic=logs partition=05 error=0 offset=206718513
2016/02/19 09:48:11 topic=logs partition=06 error=0 offset=206529762
2016/02/19 09:48:11 topic=logs partition=07 error=0 offset=206634037
2016/02/19 09:48:12 topic=logs partition=00 error=0 offset=-1
2016/02/19 09:48:12 topic=logs partition=01 error=0 offset=-1
2016/02/19 09:48:12 topic=logs partition=02 error=0 offset=-1
2016/02/19 09:48:12 topic=logs partition=03 error=0 offset=-1
2016/02/19 09:48:12 topic=logs partition=04 error=0 offset=-1
2016/02/19 09:48:12 topic=logs partition=05 error=0 offset=-1
2016/02/19 09:48:12 topic=logs partition=06 error=0 offset=-1
2016/02/19 09:48:12 topic=logs partition=07 error=0 offset=-1
2016/02/19 09:48:13 topic=logs partition=00 error=0 offset=-1
2016/02/19 09:48:13 topic=logs partition=01 error=0 offset=206686020
2016/02/19 09:48:13 topic=logs partition=02 error=0 offset=206583861
2016/02/19 09:48:13 topic=logs partition=03 error=0 offset=205800480
2016/02/19 09:48:13 topic=logs partition=04 error=0 offset=206542733
2016/02/19 09:48:13 topic=logs partition=05 error=0 offset=206728251
2016/02/19 09:48:13 topic=logs partition=06 error=0 offset=206534794
2016/02/19 09:48:13 topic=logs partition=07 error=0 offset=206643853
2016/02/19 09:48:14 topic=logs partition=00 error=0 offset=206584533
2016/02/19 09:48:14 topic=logs partition=01 error=0 offset=206690275
2016/02/19 09:48:14 topic=logs partition=02 error=0 offset=206588902
2016/02/19 09:48:14 topic=logs partition=03 error=0 offset=205805413
2016/02/19 09:48:14 topic=logs partition=04 error=0 offset=206542733
2016/02/19 09:48:14 topic=logs partition=05 error=0 offset=206733144
2016/02/19 09:48:14 topic=logs partition=06 error=0 offset=206540275
2016/02/19 09:48:14 topic=logs partition=07 error=0 offset=206649392
...
{code}

As you can see, at 09:48:12 all eight partitions come back with offset=-1, and partition 00 still does at 09:48:13, before the values return to normal at 09:48:14. The returned error code is 0 throughout, and there is no obvious reason why the returned offsets are suddenly wrong/blank. I have also added some debugging to our offset committer to make sure the numbers we are sending are correct, and they are.

Any help is greatly appreciated!
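For anyone trying to reproduce this, here is a minimal Go sketch of how the glitch could be spotted automatically in log output of the format shown above. It is not part of the original test and does not use sarama; the names `sample`, `parseSample` and `findBlankOffsets` are made up for illustration. It assumes that offset=-1 with error=0 is only suspicious once a non-negative offset has already been seen for the same partition, since a group that has never committed may legitimately get -1.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// sample is one parsed log line (hypothetical type, for illustration only).
type sample struct {
	Timestamp string // date and time exactly as printed in the log
	Topic     string
	Partition int32
	ErrCode   int16
	Offset    int64
}

// partitionKey identifies a topic/partition pair.
type partitionKey struct {
	topic     string
	partition int32
}

// parseSample parses a line such as
// "2016/02/19 09:48:12 topic=logs partition=00 error=0 offset=-1".
func parseSample(line string) (sample, error) {
	fields := strings.Fields(line)
	if len(fields) != 6 {
		return sample{}, fmt.Errorf("expected 6 fields, got %d: %q", len(fields), line)
	}
	s := sample{Timestamp: fields[0] + " " + fields[1]}
	for _, kv := range fields[2:] {
		key, val, ok := strings.Cut(kv, "=")
		if !ok {
			return sample{}, fmt.Errorf("malformed field %q", kv)
		}
		if key == "topic" {
			s.Topic = val
			continue
		}
		// Base 10 so that zero-padded partitions like "07" are not read as octal.
		n, err := strconv.ParseInt(val, 10, 64)
		if err != nil {
			return sample{}, fmt.Errorf("field %q: %w", kv, err)
		}
		switch key {
		case "partition":
			s.Partition = int32(n)
		case "error":
			s.ErrCode = int16(n)
		case "offset":
			s.Offset = n
		default:
			return sample{}, fmt.Errorf("unknown field %q", kv)
		}
	}
	return s, nil
}

// findBlankOffsets returns every sample that reports offset=-1 with error=0
// for a partition that previously reported a non-negative offset.
func findBlankOffsets(lines []string) ([]sample, error) {
	lastGood := make(map[partitionKey]int64)
	var suspicious []sample
	for _, line := range lines {
		s, err := parseSample(line)
		if err != nil {
			return nil, err
		}
		key := partitionKey{s.Topic, s.Partition}
		if s.Offset >= 0 {
			lastGood[key] = s.Offset
			continue
		}
		if _, seen := lastGood[key]; seen && s.ErrCode == 0 && s.Offset == -1 {
			suspicious = append(suspicious, s)
		}
	}
	return suspicious, nil
}

func main() {
	// Partition 00 lines copied from the log above.
	lines := []string{
		"2016/02/19 09:48:11 topic=logs partition=00 error=0 offset=206572870",
		"2016/02/19 09:48:12 topic=logs partition=00 error=0 offset=-1",
		"2016/02/19 09:48:13 topic=logs partition=00 error=0 offset=-1",
		"2016/02/19 09:48:14 topic=logs partition=00 error=0 offset=206584533",
	}
	blanks, err := findBlankOffsets(lines)
	if err != nil {
		panic(err)
	}
	for _, s := range blanks {
		fmt.Printf("%s topic=%s partition=%02d: offset=-1 after a committed offset was seen\n",
			s.Timestamp, s.Topic, s.Partition)
	}
}
```

Feeding the full log through `findBlankOffsets` should make it easier to correlate the timestamps of the blank responses with broker-side activity.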
-- This message was sent by Atlassian JIRA (v6.3.4#6332)