Jason Gustafson created KAFKA-10080:
---------------------------------------

             Summary: IllegalStateException after duplicate CompleteCommit append to transaction log
                 Key: KAFKA-10080
                 URL: https://issues.apache.org/jira/browse/KAFKA-10080
             Project: Kafka
          Issue Type: Bug
            Reporter: Jason Gustafson
            Assignee: Jason Gustafson


We noticed this exception in the logs:
{code}
java.lang.IllegalStateException: TransactionalId foo completing transaction state transition while it does not have a pending state
	at kafka.coordinator.transaction.TransactionMetadata.$anonfun$completeTransitionTo$1(TransactionMetadata.scala:357)
	at kafka.coordinator.transaction.TransactionMetadata.completeTransitionTo(TransactionMetadata.scala:353)
	at kafka.coordinator.transaction.TransactionStateManager.$anonfun$appendTransactionToLog$3(TransactionStateManager.scala:595)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at kafka.coordinator.transaction.TransactionMetadata.inLock(TransactionMetadata.scala:188)
	at kafka.coordinator.transaction.TransactionStateManager.$anonfun$appendTransactionToLog$15$adapted(TransactionStateManager.scala:587)
	at kafka.server.DelayedProduce.onComplete(DelayedProduce.scala:126)
	at kafka.server.DelayedOperation.forceComplete(DelayedOperation.scala:70)
	at kafka.server.DelayedProduce.tryComplete(DelayedProduce.scala:107)
	at kafka.server.DelayedOperation.maybeTryComplete(DelayedOperation.scala:121)
	at kafka.server.DelayedOperationPurgatory$Watchers.tryCompleteWatched(DelayedOperation.scala:378)
	at kafka.server.DelayedOperationPurgatory.checkAndComplete(DelayedOperation.scala:280)
	at kafka.cluster.DelayedOperations.checkAndCompleteAll(Partition.scala:122)
	at kafka.cluster.Partition.tryCompleteDelayedRequests(Partition.scala:1023)
	at kafka.cluster.Partition.updateFollowerFetchState(Partition.scala:740)
{code}

After inspection, we found that there were two CompleteCommit entries in the transaction state log which explains the failed transition.
Indeed the logic for writing the CompleteCommit message does seem prone to race conditions.
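
To illustrate why a second CompleteCommit completion would fail with this message, here is a minimal sketch in Java. It is a simplified, hypothetical model and not the actual Kafka code (which is Scala): the class `TxnMetadata`, the `State` enum, and the method bodies are assumptions made for illustration. Only the exception text is taken from the log above. The sketch assumes each log append is preceded by a "prepare" step that records a pending state, and that the append callback clears it. How the duplicate append came to be written is not modeled here.

```java
import java.util.Optional;

// Hypothetical, simplified model of a pending-state transition.
// Names are illustrative and do not mirror Kafka's TransactionMetadata API.
public class DuplicateCompleteSketch {

    enum State { PREPARE_COMMIT, COMPLETE_COMMIT }

    static final class TxnMetadata {
        private State state = State.PREPARE_COMMIT;
        private Optional<State> pendingState = Optional.empty();

        // Called before the log append: records the intended next state.
        synchronized void prepareTransitionTo(State next) {
            if (pendingState.isPresent()) {
                throw new IllegalStateException(
                    "transition already pending: " + pendingState.get());
            }
            pendingState = Optional.of(next);
        }

        // Called from the append callback: applies and clears the pending state.
        synchronized void completeTransitionTo(State next) {
            State pending = pendingState.orElseThrow(() -> new IllegalStateException(
                "completing transaction state transition while it does not have a pending state"));
            if (pending != next) {
                throw new IllegalStateException(
                    "pending state " + pending + " does not match " + next);
            }
            state = next;
            pendingState = Optional.empty();
        }

        synchronized State state() {
            return state;
        }
    }

    public static void main(String[] args) {
        TxnMetadata txn = new TxnMetadata();
        txn.prepareTransitionTo(State.COMPLETE_COMMIT);

        // The callback for the first append completes the transition normally.
        txn.completeTransitionTo(State.COMPLETE_COMMIT);

        // The callback for a duplicate append finds no pending state and throws.
        try {
            txn.completeTransitionTo(State.COMPLETE_COMMIT);
        } catch (IllegalStateException e) {
            System.out.println("second completion failed: " + e.getMessage());
        }
    }
}
```

In this model the first callback consumes the pending state, so any later callback for the same transition has nothing to complete. That matches the shape of the failure above, under the stated assumptions.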