[ 
https://issues.apache.org/jira/browse/KAFKA-14366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan updated KAFKA-14366:
---------------------------
    Description: 
Hi All,

We are seeing an issue when consumer clients restart (not every restart 
triggers it): during the resulting rebalance, the offset of one of the 
partitions sometimes jumps back to a position far behind the committed 
offset.

Scenario:

Assume we have 4 consumer instances that are restarted one after the other.
 # When the restarts begin, the offset for partition 10 of the consumed topic 
is at 50000 (the last offset of the topic, with 0 lag).
 # Once the restarts start (triggering rebalances), the offset suddenly points 
to 20000.
 # While the restarts are still in progress, the consumer assigned to that 
partition starts reading from 20000 and carries on.
 # By the time all rebalances have completed, every message from offset 20000 
to 50000 (where consumption had originally stopped) has been read again.
We end up with around 30K duplicates.
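
To make the arithmetic in the scenario concrete, here is a minimal sketch in 
Python. It is illustrative only: the function names are hypothetical, it does 
not call any Kafka client API, and it assumes the committed offset is the next 
offset to be read, so resuming at 20000 with 50000 committed re-reads offsets 
20000 through 49999.

```python
def redelivered_count(committed_offset: int, resumed_offset: int) -> int:
    """Records read a second time when a consumer resumes below the committed offset.

    Assumption: committed_offset is the next offset to read, so the re-read
    range is [resumed_offset, committed_offset).
    """
    return max(0, committed_offset - resumed_offset)


def describe_rewind(partition: int, committed_offset: int, resumed_offset: int) -> str:
    """Build a log line that flags a partition whose position moved backwards."""
    duplicates = redelivered_count(committed_offset, resumed_offset)
    if duplicates > 0:
        return (
            f"partition {partition}: rewound from {committed_offset} "
            f"to {resumed_offset}, {duplicates} records re-read"
        )
    return f"partition {partition}: no rewind"


# Example numbers taken from the scenario above.
print(describe_rewind(10, 50000, 20000))
```

Logging a line like this for each assigned partition after every rebalance 
would show which partitions rewind and by how much.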

(The numbers here are just an example. In production we see a very large 
number of duplicates. In roughly 2 out of every 10 restart exercises, the 
rebalance ends in such duplicates. Not all partitions are affected: only one 
or two partitions behave this way, seemingly at random.)

This seems to be a bug. I am attaching all screenshots for reference as well.

Can someone kindly help out here?

  was:
Hi All,

We are facing an issue while the client consumer restart (again not all 
restarts are ending up with this issue) and during the re-balancing scenario, 
sometimes one of the partition offsets goes back a long way from the committed 
offset.

Scenario :

Assume we have 4 instances of consumer and restarts of consumer one after the 
other.
 # At the time of starting restarts assume the offset on partition 10 of a 
topic being consumed is pointing to 50000. (last offset of the topic)
 # When restarts start suddenly the offsets start pointing to 20000.
 # While all the restarts are going on the consumer who is attached starts 
reading from 20000 and goes on.
 # Once all rebalance is completed, and all messages from 20000 to 50000 offset 
has been read (where it had stopped initially)
We end up having around 30K duplicates.

(The numbers here are just an example, in production, we are facing huge 
duplicates and every two rebalance during restarts of consumer out of 10 
restart exercise activity ends up in such duplicates and not all partitions and 
only one or two partitions behave this way and randomly)

This seems to be a bug. I am attaching all screenshots for reference as well.

Can someone kindly help out here?


> Kafka consumer rebalance issue, offsets points back to very old committed 
> offset
> --------------------------------------------------------------------------------
>
>                 Key: KAFKA-14366
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14366
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer, offset manager
>    Affects Versions: 2.8.1
>         Environment: Production
>            Reporter: Chetan
>            Priority: Major
>         Attachments: rebalance issue.docx
>
>
> Hi All,
> We are seeing an issue when consumer clients restart (not every restart 
> triggers it): during the resulting rebalance, the offset of one of the 
> partitions sometimes jumps back to a position far behind the committed 
> offset.
> Scenario:
> Assume we have 4 consumer instances that are restarted one after the other.
>  # When the restarts begin, the offset for partition 10 of the consumed 
> topic is at 50000 (the last offset of the topic, with 0 lag).
>  # Once the restarts start (triggering rebalances), the offset suddenly 
> points to 20000.
>  # While the restarts are still in progress, the consumer assigned to that 
> partition starts reading from 20000 and carries on.
>  # By the time all rebalances have completed, every message from offset 
> 20000 to 50000 (where consumption had originally stopped) has been read 
> again.
> We end up with around 30K duplicates.
> (The numbers here are just an example. In production we see a very large 
> number of duplicates. In roughly 2 out of every 10 restart exercises, the 
> rebalance ends in such duplicates. Not all partitions are affected: only one 
> or two partitions behave this way, seemingly at random.)
> This seems to be a bug. I am attaching all screenshots for reference as well.
> Can someone kindly help out here?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
