[ 
https://issues.apache.org/jira/browse/KAFKA-15552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Chen resolved KAFKA-15552.
-------------------------------
    Resolution: Fixed

> Duplicate Producer ID blocks during ZK migration
> ------------------------------------------------
>
>                 Key: KAFKA-15552
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15552
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 3.4.0, 3.5.0, 3.4.1, 3.6.0, 3.5.1
>            Reporter: David Arthur
>            Assignee: David Arthur
>            Priority: Critical
>             Fix For: 3.5.2, 3.6.1
>
>
> When migrating producer ID blocks from ZK to KRaft, we are taking the current 
> producer ID block from ZK and writing its "firstProducerId" into the 
> producer IDs KRaft record. However, in KRaft we store the _next_ producer ID 
> block in the log rather than storing the current block like ZK does. The end 
> result is that the first block given to a caller of AllocateProducerIds is a 
> duplicate of the last block allocated in ZK mode.
>  
> This can result in duplicate producer IDs being given to transactional or 
> idempotent producers. In the case of transactional producers, this can cause 
> long term problems since the producer IDs are persisted and reused for a long 
> time.
> This bug can occur in the window between the ZK controller allocating its 
> last producer ID block and all the brokers being restarted following the 
> metadata migration.
>  
> Symptoms of this bug will include ReplicaManager OutOfOrderSequenceException 
> and possibly some producer epoch validation errors. To see if a cluster is 
> affected by this bug, search for the offending producer ID and see if it is 
> being used by more than one producer.
>  
> For example, the following error was observed:
> {code}
> Out of order sequence number for producer 376000 at offset 381338 in 
> partition REDACTED: 0 (incoming seq. number), 21 (current end sequence 
> number) 
> {code}
> Searching for "376000" in the 
> org.apache.kafka.clients.producer.internals.TransactionManager logs then 
> shows the same producer ID being provisioned on two brokers:
> {code}
> Broker 0 [Producer clientId=REDACTED-0] ProducerId set to 376000 with epoch 1
> Broker 5 [Producer clientId=REDACTED-1] ProducerId set to 376000 with epoch 1
> {code}
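
The log search described above could be scripted roughly as follows. This is a minimal sketch, not a tool shipped with Kafka: the function name find_shared_producer_ids is hypothetical, and the regular expression assumes the TransactionManager line format quoted in the issue (optionally with extra fields such as a transactional ID inside the brackets). It only flags a producer ID that appears under more than one client ID, so two producer instances that share one client ID would not be detected.

```python
import re
from collections import defaultdict

# Assumed line format, taken from the log excerpt quoted in the issue:
#   "[Producer clientId=app-0] ProducerId set to 376000 with epoch 1"
# Extra bracketed fields after the client ID (if any) are skipped.
PID_LINE = re.compile(
    r"\[Producer clientId=(?P<client>[^\],]+)[^\]]*\] "
    r"ProducerId set to (?P<pid>\d+) with epoch (?P<epoch>\d+)"
)


def find_shared_producer_ids(lines):
    """Return {producer_id: sorted client IDs} for IDs seen under more than one client."""
    clients_by_pid = defaultdict(set)
    for line in lines:
        match = PID_LINE.search(line)
        if match:
            clients_by_pid[int(match.group("pid"))].add(match.group("client"))
    # Keep only producer IDs that were handed to two or more distinct clients.
    return {
        pid: sorted(clients)
        for pid, clients in clients_by_pid.items()
        if len(clients) > 1
    }
```

Feeding it the concatenated producer logs from all hosts (for example, by iterating over open log files) returns the suspicious producer IDs, which can then be checked by hand.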


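The off-by-one-block mistake described in the issue can be modelled in a few lines. This is an illustrative sketch only and not the Kafka controller code: the Block class, the two migrate_* functions, and the block size of 1000 are assumptions made for the example, and the ID 376000 is reused from the log excerpt purely as sample input.

```python
from dataclasses import dataclass

BLOCK_SIZE = 1000  # assumed block size, for illustration only


@dataclass(frozen=True)
class Block:
    """A contiguous range of producer IDs starting at `first`."""
    first: int
    size: int = BLOCK_SIZE

    @property
    def last(self) -> int:
        return self.first + self.size - 1


def migrate_buggy(zk_current: Block) -> int:
    """Copy the ZK block's first ID into the KRaft 'next producer ID' (the bug)."""
    return zk_current.first


def migrate_fixed(zk_current: Block) -> int:
    """Store the first ID after the ZK block, so allocation resumes past it."""
    return zk_current.last + 1


def allocate(next_producer_id: int) -> Block:
    """KRaft-style allocation: hand out the block that starts at the stored next ID."""
    return Block(next_producer_id)


zk_block = Block(376000)  # last block handed out in ZK mode (sample value)

# Buggy migration: the first KRaft allocation repeats the ZK block.
assert allocate(migrate_buggy(zk_block)) == zk_block

# Corrected migration: the first KRaft allocation starts after the ZK block.
assert allocate(migrate_fixed(zk_block)).first == 377000
```

The model makes the difference in semantics explicit: ZK holds the block already in use, while the KRaft record holds where the next block begins, so the migrated value has to point past the end of the ZK block.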

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
