jolshan opened a new pull request, #13579:
URL: https://github.com/apache/kafka/pull/13579

   
[KAFKA-14561](https://github.com/apache/kafka/commit/56dcb837a2f1c1d8c016cfccf8268a910bb77a36)
 added verification to transactional produce requests to confirm an ongoing 
transaction.
   
   There is an edge case where the transaction has been added, but the 
coordinator is still writing the ongoing state to the log. In this case, 
verification returns CONCURRENT_TRANSACTIONS and the producer retries the batch. 
However, the next inflight batch often succeeds because the coordinator's write 
has completed by the time it is verified. 
   
   When a partition has no entry in the PSM (ProducerStateManager), it will 
allow any sequence number. So once that later inflight batch is written, the 
PSM expects only the sequence numbers that follow it. This means that if we 
retry the first write to the partition (or the first write in a while), we will 
never be able to write it and will get OutOfOrderSequence exceptions. This is a 
known issue. Since the verification makes this more common, I propose allowing 
verification to succeed while the ongoing state is still pending. We will 
potentially have hanging transactions if the coordinator crashes before the 
writes complete, but this is better than endless out-of-order exceptions and 
better than not verifying at all, so it is the best available compromise.
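
   The failure sequence above can be sketched with a toy model. This is not 
Kafka's actual `ProducerStateManager` or coordinator code: the class 
`SequenceGapDemo`, its `append` method, and the two boolean flags are 
hypothetical names used only to illustrate the ordering problem, assuming a 
partition with no producer entry accepts any sequence and afterwards expects 
`last + 1`.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only; names and structure are illustrative assumptions,
// not Kafka's real broker code.
class SequenceGapDemo {
    enum Result { OK, CONCURRENT_TRANSACTIONS, OUT_OF_ORDER_SEQUENCE }

    // Last appended sequence per producer id; a missing entry means "no state yet".
    private final Map<Long, Integer> lastSequence = new HashMap<>();

    // coordinatorWritePending: the coordinator is still persisting the ongoing state.
    // verifyOnPending: the proposed behavior, where pending ongoing state passes verification.
    Result append(long producerId, int sequence,
                  boolean coordinatorWritePending, boolean verifyOnPending) {
        if (coordinatorWritePending && !verifyOnPending) {
            // Retriable error: nothing is appended, the producer resends later.
            return Result.CONCURRENT_TRANSACTIONS;
        }
        Integer last = lastSequence.get(producerId);
        if (last != null && sequence != last + 1) {
            return Result.OUT_OF_ORDER_SEQUENCE;
        }
        // With no entry, any sequence is accepted and becomes the baseline.
        lastSequence.put(producerId, sequence);
        return Result.OK;
    }

    public static void main(String[] args) {
        long pid = 1L;

        // Current behavior: the first batch is rejected, the second slips in.
        SequenceGapDemo current = new SequenceGapDemo();
        System.out.println(current.append(pid, 0, true, false));  // CONCURRENT_TRANSACTIONS
        System.out.println(current.append(pid, 1, false, false)); // OK
        System.out.println(current.append(pid, 0, false, false)); // OUT_OF_ORDER_SEQUENCE

        // Proposed behavior: the pending ongoing state passes verification.
        SequenceGapDemo proposed = new SequenceGapDemo();
        System.out.println(proposed.append(pid, 0, true, true));  // OK
        System.out.println(proposed.append(pid, 1, false, true)); // OK
    }
}
```

   In the first run the retried batch with sequence 0 can never be appended, 
because the model now expects sequence 2. In the second run both batches are 
appended in order, at the cost described above if the coordinator crashes 
before its write completes.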
   
   The good news is part 2 of KIP-890 will allow us to enforce that the first 
write for a transaction is sequence 0 and this issue will go away entirely.  
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


