[ https://issues.apache.org/jira/browse/KAFKA-10080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jason Gustafson resolved KAFKA-10080.
-------------------------------------
    Fix Version/s: 2.6.0
       Resolution: Fixed

> IllegalStateException after duplicate CompleteCommit append to transaction log
> ------------------------------------------------------------------------------
>
>                 Key: KAFKA-10080
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10080
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Jason Gustafson
>            Assignee: Jason Gustafson
>            Priority: Major
>             Fix For: 2.6.0
>
>
> We noticed this exception in the logs:
> {code}
> java.lang.IllegalStateException: TransactionalId foo completing transaction state transition while it does not have a pending state
>     at kafka.coordinator.transaction.TransactionMetadata.$anonfun$completeTransitionTo$1(TransactionMetadata.scala:357)
>     at kafka.coordinator.transaction.TransactionMetadata.completeTransitionTo(TransactionMetadata.scala:353)
>     at kafka.coordinator.transaction.TransactionStateManager.$anonfun$appendTransactionToLog$3(TransactionStateManager.scala:595)
>     at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>     at kafka.coordinator.transaction.TransactionMetadata.inLock(TransactionMetadata.scala:188)
>     at kafka.coordinator.transaction.TransactionStateManager.$anonfun$appendTransactionToLog$15$adapted(TransactionStateManager.scala:587)
>     at kafka.server.DelayedProduce.onComplete(DelayedProduce.scala:126)
>     at kafka.server.DelayedOperation.forceComplete(DelayedOperation.scala:70)
>     at kafka.server.DelayedProduce.tryComplete(DelayedProduce.scala:107)
>     at kafka.server.DelayedOperation.maybeTryComplete(DelayedOperation.scala:121)
>     at kafka.server.DelayedOperationPurgatory$Watchers.tryCompleteWatched(DelayedOperation.scala:378)
>     at kafka.server.DelayedOperationPurgatory.checkAndComplete(DelayedOperation.scala:280)
>     at kafka.cluster.DelayedOperations.checkAndCompleteAll(Partition.scala:122)
>     at kafka.cluster.Partition.tryCompleteDelayedRequests(Partition.scala:1023)
>     at kafka.cluster.Partition.updateFollowerFetchState(Partition.scala:740)
> {code}
> After inspection, we found that there were two CompleteCommit entries in the
> transaction state log which explains the failed transition. Indeed the logic
> for writing the CompleteCommit message does seem prone to race conditions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
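
For readers unfamiliar with the pending-state pattern behind the exception in the report, here is a minimal sketch in Java. It is not Kafka's code: the class `PendingStateSketch`, its nested `Metadata` and `State` types, and the method bodies are hypothetical simplifications that borrow only the method name `completeTransitionTo` and the message text from the stack trace. The sketch assumes a transition is first recorded as pending and then completed once the log append succeeds, so a second completion for the same append finds no pending state and is rejected.

```java
import java.util.Optional;

// Hypothetical, simplified model of a "prepare, then complete" state transition.
// It only illustrates why a duplicate completion fails; it is not Kafka's implementation.
class PendingStateSketch {

    enum State { PREPARE_COMMIT, COMPLETE_COMMIT }

    static final class Metadata {
        private final String transactionalId;
        private State state = State.PREPARE_COMMIT;
        private Optional<State> pendingState = Optional.empty();

        Metadata(String transactionalId) {
            this.transactionalId = transactionalId;
        }

        // Record the target state before the log append is issued.
        synchronized void prepareTransitionTo(State target) {
            pendingState = Optional.of(target);
        }

        // Called from the append callback; consumes the pending state.
        synchronized void completeTransitionTo() {
            State target = pendingState.orElseThrow(() -> new IllegalStateException(
                "TransactionalId " + transactionalId
                    + " completing transaction state transition"
                    + " while it does not have a pending state"));
            state = target;
            pendingState = Optional.empty();
        }

        synchronized State state() {
            return state;
        }
    }

    public static void main(String[] args) {
        Metadata metadata = new Metadata("foo");
        metadata.prepareTransitionTo(State.COMPLETE_COMMIT);

        // Callback for the first CompleteCommit append: the transition completes.
        metadata.completeTransitionTo();

        // Callback for a duplicate CompleteCommit append: nothing is pending any more.
        try {
            metadata.completeTransitionTo();
        } catch (IllegalStateException e) {
            System.out.println("Duplicate completion rejected: " + e.getMessage());
        }
    }
}
```

In this simplified model the check itself is sound; the failure only signals that two completions were issued for one prepared transition, which matches the two CompleteCommit entries found in the log.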