Justine Olshan created KAFKA-19446:
--------------------------------------

             Summary: TV2 late marker can violate EOS guarantees.
                 Key: KAFKA-19446
                 URL: https://issues.apache.org/jira/browse/KAFKA-19446
             Project: Kafka
          Issue Type: Task
    Affects Versions: 4.0.0, 4.1.0
            Reporter: Justine Olshan
            Assignee: Justine Olshan


One case we missed in KIP-890 is a late-arriving WriteTxnMarkerRequest that 
reaches a partition for a transaction using TV2. 

Because we write the marker with epoch + 1, we also send the request with 
epoch + 1. Due to the somewhat relaxed epoch check at the log layer 
(https://github.com/apache/kafka/blob/fd70290633191b6f53a9d4ddb24e3a8b619fcd3f/storage/src/main/java/org/apache/kafka/storage/internals/log/ProducerAppendInfo.java#L211), 
we can actually accept a late-arriving request for the previous transaction, 
since its epoch will be the same as the current producer state epoch. 
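
To make the sequence concrete, here is a minimal sketch of the failure. It is a 
simplified model, not the actual ProducerAppendInfo code: the class name, the 
method name, and the epoch values are made up for illustration, and the relaxed 
check is assumed to reject only markers whose epoch is strictly older than the 
producer state epoch. 

```java
// Simplified, hypothetical model of the relaxed epoch check at the log layer.
final class LateMarkerDemo {

    // Assumed behavior: a marker is rejected only if its epoch is strictly
    // older than the current producer state epoch; an equal epoch passes.
    static boolean relaxedAccepts(short currentEpoch, short markerEpoch) {
        return markerEpoch >= currentEpoch;
    }

    public static void main(String[] args) {
        short stateEpoch = 5;   // transaction 1 writes records with epoch 5
        short marker1 = 6;      // TV2: its marker is written with epoch 5 + 1

        // The marker for transaction 1 arrives on time and is accepted.
        System.out.println(relaxedAccepts(stateEpoch, marker1)); // prints "true"
        stateEpoch = marker1;   // the marker updates the producer state epoch to 6

        // Transaction 2 now writes records with epoch 6. A late duplicate of
        // the marker for transaction 1 (still epoch 6) then arrives.
        System.out.println(relaxedAccepts(stateEpoch, marker1)); // prints "true"
        // The late marker is accepted and would end transaction 2 early.
    }
}
```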

We should tighten this check so that it does not allow the same epoch when 
using TV2. In other words, the marker epoch should always be >= the current 
producer state epoch + 1. (It can be more than + 1 if we restart the producer 
and bump the epoch.) We just need a good way to tell whether a marker is meant 
for a TV2 transaction. 
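
A rough sketch of the tightened rule, under stated assumptions: the isTv2 flag 
is hypothetical (how to tell that a marker belongs to a TV2 transaction is the 
open question above), the class and method names are invented for 
illustration, and epoch exhaustion near Short.MAX_VALUE is not handled. 

```java
// Hypothetical sketch of the proposed stricter marker epoch check.
final class MarkerEpochCheck {

    // isTv2 is an assumed input; the issue does not yet say how the log
    // layer would learn that a marker belongs to a TV2 transaction.
    static boolean acceptsMarker(short currentEpoch, short markerEpoch, boolean isTv2) {
        if (isTv2) {
            // TV2: the marker epoch must be at least currentEpoch + 1,
            // so a late marker carrying the same epoch is rejected.
            return markerEpoch > currentEpoch;
        }
        // Non-TV2: keep the existing relaxed check (same epoch allowed).
        return markerEpoch >= currentEpoch;
    }
}
```

With a producer state epoch of 6, a TV2 marker with epoch 6 would be rejected, 
while markers with epoch 7 (normal case) or 9 (producer restarted and bumped 
the epoch) would still be accepted. 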

This + 1 rule works even if we didn't produce records, since the previous 
marker will update the epoch.



