[
https://issues.apache.org/jira/browse/KAFKA-5396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16040367#comment-16040367
]
Apurva Mehta commented on KAFKA-5396:
-------------------------------------
[~hachikuji] found the issue.
{noformat}
[2017-06-07 05:51:33,401] DEBUG Group concurrent_consumer fetching committed offsets for partitions: [output-topic-1, output-topic-2, output-topic-0] (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2017-06-07 05:51:33,415] DEBUG Group concurrent_consumer has no committed offset for partition output-topic-1 (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2017-06-07 05:51:33,417] TRACE Sending ListOffsetRequest (type=ListOffsetRequest, replicaId=-1, partitionTimestamps={output-topic-1=-2, output-topic-2=-2}, minVersion=2) to broker knode08:9092 (id: 1 rack: null) (org.apache.kafka.clients.consumer.internals.Fetcher)
[2017-06-07 05:51:33,451] DEBUG Handling ListOffsetResponse response for output-topic-1. Fetched offset 0, timestamp -1 (org.apache.kafka.clients.consumer.internals.Fetcher)
[2017-06-07 05:51:33,452] DEBUG Resetting offset for partition output-topic-1 to offset 0. (org.apache.kafka.clients.consumer.internals.Fetcher)
{noformat}
Since the consumer never commits offsets, there is nothing to restore after the
group rebalance, so its positions are reset to the beginning of the log. The
rebalance begins just before the reset above:
{noformat}
[2017-06-07 05:51:33,145] INFO [GroupCoordinator 3]: Preparing to restabilize group concurrent_consumer with old generation 2 (kafka.coordinator.group.GroupCoordinator)
{noformat}
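The logs are consistent with a consumer that commits nothing and has `auto.offset.reset=earliest`: the ListOffsetRequest above carries timestamp -2 (the "earliest" sentinel), so with no committed offset the position falls back to the log start. A minimal sketch of the consumer settings that produce this behavior — the property keys are standard Kafka consumer configs, and the group id and broker address are taken from the logs, but the exact test configuration is an assumption:

```java
import java.util.Properties;

public class ConsumerConfigSketch {
    // Hypothetical reconstruction of the test's consumer configuration,
    // inferred from the logs in this comment and the issue description.
    public static Properties concurrentConsumerProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "knode08:9092");      // broker seen in the logs
        props.setProperty("group.id", "concurrent_consumer");        // group seen in the logs
        // The test never commits offsets...
        props.setProperty("enable.auto.commit", "false");
        // ...so after a rebalance there is no committed position to restore,
        // and the consumer falls back to the earliest offset (the ListOffsetRequest
        // with timestamp -2), i.e. offset 0 -- re-reading records it already returned.
        props.setProperty("auto.offset.reset", "earliest");
        // The description says the consumer runs in READ_COMMITTED mode.
        props.setProperty("isolation.level", "read_committed");
        return props;
    }

    public static void main(String[] args) {
        Properties props = concurrentConsumerProps();
        System.out.println(props.getProperty("auto.offset.reset"));
    }
}
```

With these settings, any rebalance (e.g. after a hard broker bounce) discards the in-memory position and re-reads from offset 0, which matches the duplicate at offset 250 described below.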
> Consumer reading uninterrupted from beginning of log can read the same
> message multiple times.
> ----------------------------------------------------------------------------------------------
>
> Key: KAFKA-5396
> URL: https://issues.apache.org/jira/browse/KAFKA-5396
> Project: Kafka
> Issue Type: Bug
> Reporter: Apurva Mehta
> Attachments: KAFKA-5396.tar.gz
>
>
> I noticed this when running the transactions system test with hard broker
> bounces. We have a consumer in READ_COMMITTED mode reading from the tail of
> the log as the writes are appended.
> This test has failed once because the concurrent consumer returned duplicate
> data. The actual log has no duplicates, so the problem is in the consumer.
> One of the duplicate values is '0', and is at offset 250 in output-topic-1.
> The first time it is read, we see the following.
> {noformat}
> [2017-06-07 05:50:34,601] TRACE Returning fetched records at offset 0 for
> assigned partition output-topic-0 and update position to 250
> (org.apache.kafka.clients.consumer.internals.Fetcher)
> [2017-06-07 05:50:34,602] TRACE Preparing to read 2967 bytes of data for
> partition output-topic-1 with offset 250
> (org.apache.kafka.clients.consumer.internals.Fetcher)
> [2017-06-07 05:50:34,602] TRACE Updating high watermark for partition
> output-topic-1 to 502 (org.apache.kafka.clients.consumer.internals.Fetcher)
> [2017-06-07 05:50:34,613] TRACE Returning fetched records at offset 250 for
> assigned partition output-topic-1 and update position to 500
> (org.apache.kafka.clients.consumer.internals.Fetcher)
> {noformat}
> The next time it is read, we see this
> {noformat}
> [2017-06-07 05:51:36,386] TRACE Preparing to read 169858 bytes of data for
> partition output-topic-1 with offset 0
> (org.apache.kafka.clients.consumer.internals.Fetcher)
> [2017-06-07 05:51:36,389] TRACE Updating high watermark for partition
> output-topic-1 to 13053 (org.apache.kafka.clients.consumer.internals.Fetcher)
> [2017-06-07 05:51:36,391] TRACE Returning fetched records at offset 0 for
> assigned partition output-topic-1 and update position to 500
> (org.apache.kafka.clients.consumer.internals.Fetcher)
> {noformat}
> For some reason, the fetcher re-sent the data from offset 0, and reset the
> position to 500.
> This is the plain consumer doing 'poll' in a loop until it is killed. So this
> position reset is puzzling.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)