Jason Gustafson created KAFKA-14830:
---------------------------------------
Summary: Illegal state error in transactional producer
Key: KAFKA-14830
URL: https://issues.apache.org/jira/browse/KAFKA-14830
Project: Kafka
Issue Type: Bug
Affects Versions: 3.1.2
Reporter: Jason Gustafson
We have seen the following illegal state error in the producer:
{code:java}
[Producer clientId=client-id2, transactionalId=transactional-id] Transiting to abortable error state due to org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for topic-0:120027 ms has passed since batch creation
[Producer clientId=client-id2, transactionalId=transactional-id] Transiting to abortable error state due to org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for topic-1:120026 ms has passed since batch creation
[Producer clientId=client-id2, transactionalId=transactional-id] Aborting incomplete transaction
[Producer clientId=client-id2, transactionalId=transactional-id] Invoking InitProducerId with current producer ID and epoch ProducerIdAndEpoch(producerId=191799, epoch=0) in order to bump the epoch
[Producer clientId=client-id2, transactionalId=transactional-id] ProducerId set to 191799 with epoch 1
[Producer clientId=client-id2, transactionalId=transactional-id] Transiting to abortable error state due to org.apache.kafka.common.errors.NetworkException: Disconnected from node 4
[Producer clientId=client-id2, transactionalId=transactional-id] Transiting to abortable error state due to org.apache.kafka.common.errors.TimeoutException: The request timed out.
[Producer clientId=client-id2, transactionalId=transactional-id] Uncaught error in request completion:
java.lang.IllegalStateException: TransactionalId transactional-id: Invalid transition attempted from state READY to state ABORTABLE_ERROR
at org.apache.kafka.clients.producer.internals.TransactionManager.transitionTo(TransactionManager.java:1089)
at org.apache.kafka.clients.producer.internals.TransactionManager.transitionToAbortableError(TransactionManager.java:508)
at org.apache.kafka.clients.producer.internals.TransactionManager.maybeTransitionToErrorState(TransactionManager.java:734)
at org.apache.kafka.clients.producer.internals.TransactionManager.handleFailedBatch(TransactionManager.java:739)
at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:753)
at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:743)
at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:695)
at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:634)
at org.apache.kafka.clients.producer.internals.Sender.lambda$null$1(Sender.java:575)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
at org.apache.kafka.clients.producer.internals.Sender.lambda$handleProduceResponse$2(Sender.java:562)
at java.base/java.lang.Iterable.forEach(Iterable.java:75)
at org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:562)
at org.apache.kafka.clients.producer.internals.Sender.lambda$sendProduceRequest$5(Sender.java:836)
at org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:109)
at org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:583)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:575)
at org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:328)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:243)
at java.base/java.lang.Thread.run(Thread.java:829)
{code}
The producer hits batch expiration timeouts, which cause it to abort the active
transaction. After the abort, the producer bumps its epoch (from 0 to 1 in the
log above), which transitions it back to the `READY` state. Two errors for
in-flight requests then arrive (a `NetworkException` and a `TimeoutException`),
which cause an illegal state transition from `READY` to `ABORTABLE_ERROR`. But
how could the transaction abort complete if there were still in-flight requests?
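
To make the sequence concrete, here is a minimal sketch that replays it against a toy state machine. It is not the real `TransactionManager`: the class `TransitionSketch`, its reduced set of states, and the transition table are assumptions for illustration only. The one rule it relies on, that `ABORTABLE_ERROR` is not reachable from `READY`, is taken from the exception message above.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical, simplified model of the producer's transaction states.
// Only the states relevant to this report are included.
final class TransitionSketch {
    enum State { READY, IN_TRANSACTION, ABORTING_TRANSACTION, ABORTABLE_ERROR }

    private State current = State.READY;

    // Assumed transition table: the states from which `target` may be entered.
    static Set<State> allowedSources(State target) {
        switch (target) {
            case READY:
                return EnumSet.of(State.ABORTING_TRANSACTION);
            case IN_TRANSACTION:
                return EnumSet.of(State.READY);
            case ABORTING_TRANSACTION:
                return EnumSet.of(State.IN_TRANSACTION, State.ABORTABLE_ERROR);
            case ABORTABLE_ERROR:
                // Per the exception in the log, READY is not a valid source here.
                return EnumSet.of(State.IN_TRANSACTION, State.ABORTABLE_ERROR);
            default:
                return EnumSet.noneOf(State.class);
        }
    }

    // Moves to `target`, or throws if the transition is not in the table.
    void transitionTo(State target) {
        if (!allowedSources(target).contains(current)) {
            throw new IllegalStateException("Invalid transition attempted from state "
                    + current + " to state " + target);
        }
        current = target;
    }

    State state() {
        return current;
    }

    public static void main(String[] args) {
        TransitionSketch tm = new TransitionSketch();
        tm.transitionTo(State.IN_TRANSACTION);       // transaction begins
        tm.transitionTo(State.ABORTABLE_ERROR);      // batch for topic-0 expires
        tm.transitionTo(State.ABORTABLE_ERROR);      // batch for topic-1 expires
        tm.transitionTo(State.ABORTING_TRANSACTION); // "Aborting incomplete transaction"
        tm.transitionTo(State.READY);                // abort completes, epoch bumped
        try {
            tm.transitionTo(State.ABORTABLE_ERROR);  // late failure of an in-flight request
        } catch (IllegalStateException e) {
            // prints: Invalid transition attempted from state READY to state ABORTABLE_ERROR
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the state check itself behaves as intended. The unexpected part is that a failure callback for a request sent before the abort runs only after the return to `READY`, which is the ordering the question above asks about.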