[ 
https://issues.apache.org/jira/browse/KAFKA-9199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17753593#comment-17753593
 ] 

Fei Xie commented on KAFKA-9199:
--------------------------------

Hi [~hachikuji], I looked into the relevant code, and it seems that the 
proposed solution won't work: the sequence number wraps around from INT_MAX 
to 0, so we cannot tell whether a given sequence number is lower or higher 
than the cached ones. We can only tell that it is OUT_OF_ORDER.
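
To illustrate the wrap-around problem, here is a minimal sketch. It is not 
taken from the Kafka code base: the class and method names (SequenceWrapDemo, 
nextSequence, isInSequence, looksLowerNaive) are hypothetical, and it assumes 
only that sequence numbers are 32-bit ints that wrap from Integer.MAX_VALUE 
back to 0.

```java
// Hypothetical sketch, not Kafka source code.
class SequenceWrapDemo {

    // Advance a sequence number, wrapping to 0 after Integer.MAX_VALUE.
    static int nextSequence(int sequence, int increment) {
        if (sequence > Integer.MAX_VALUE - increment) {
            return increment - (Integer.MAX_VALUE - sequence) - 1;
        }
        return sequence + increment;
    }

    // The one check that stays reliable across a wrap:
    // is `incoming` exactly the successor of `lastAcked`?
    static boolean isInSequence(int lastAcked, int incoming) {
        return incoming == nextSequence(lastAcked, 1);
    }

    // Naive "is this a duplicate?" check based on numeric ordering.
    static boolean looksLowerNaive(int lastAcked, int incoming) {
        return incoming < lastAcked;
    }

    public static void main(String[] args) {
        // The producer has wrapped: ..., MAX_VALUE - 1, MAX_VALUE, 0, 1, 2
        int lastAcked = 2;
        // A retried batch that was originally sent before the wrap.
        int retried = Integer.MAX_VALUE - 1;

        // The retried batch is older, yet it does not compare as lower.
        System.out.println("naive lower check: " + looksLowerNaive(lastAcked, retried));
        // All we can say for sure is that it is not the expected next sequence.
        System.out.println("in sequence: " + isInSequence(lastAcked, retried));
    }
}
```

In this scenario the retried batch is older than the last acked one, but a 
plain numeric comparison does not classify it as lower, so a check that only 
compares against the last acked sequence cannot separate a duplicate from a 
genuinely out-of-order batch.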

> Improve handling of out of sequence errors lower than last acked sequence
> -------------------------------------------------------------------------
>
>                 Key: KAFKA-9199
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9199
>             Project: Kafka
>          Issue Type: Bug
>          Components: producer 
>            Reporter: Jason Gustafson
>            Priority: Major
>
> The broker attempts to cache the state of the last 5 batches in order to 
> enable duplicate detection. This caching is not guaranteed across restarts: 
> we only write the state of the last batch to the snapshot file. It is 
> possible in some cases for this to result in a sequence such as the following:
>  # Send sequence=n
>  # Sequence=n successfully written, but response is not received
>  # Leader changes after broker restart
>  # Send sequence=n+1
>  # Receive successful response for n+1
>  # Sequence=n times out and is retried, results in out of order sequence
> There are a couple of problems here. First, it would probably be better for the 
> broker to return DUPLICATE_SEQUENCE_NUMBER when a sequence number is received 
> which is lower than any of the cached batches. Second, the producer handles 
> this situation by just retrying until expiration of the delivery timeout. 
> Instead it should just fail the batch. 
> This issue popped up in the reassignment system test. It ultimately caused 
> the test to fail because the producer was stuck retrying the duplicate batch 
> repeatedly until ultimately giving up.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
