[ 
https://issues.apache.org/jira/browse/KAFKA-16221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax resolved KAFKA-16221.
-------------------------------------
    Resolution: Fixed

> IllegalStateException from Producer
> -----------------------------------
>
>                 Key: KAFKA-16221
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16221
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 3.6.0
>            Reporter: Matthias J. Sax
>            Priority: Critical
>             Fix For: 3.7.0
>
>
> https://issues.apache.org/jira/browse/KAFKA-14831 fixed a producer bug about 
> internal TX state transition and the producer is now throwing an 
> IllegalStateException in situations it did swallow an internal error before.
> This change surfaces a bug in Kafka Streams: Kafka Streams calls 
> `abortTransaction()` blindly when a task is closed dirty, even if the 
> Producer is already in an internal fatal state. However, if the Producer is 
> in a fatal state, Kafka Streams should skip `abortTransaction` and only 
> `close()` the Producer when closing a task dirty.
> The bug is surfaced after `commitTransaction()` did timeout or after an 
> `InvalidProducerEpochException` from a `send()` call, leading to the call to 
> `abortTransaction()` – Kafka Streams does not track right now if a commit-TX 
> is in progress.
> {code:java}
> java.lang.IllegalStateException: Cannot attempt operation `abortTransaction` 
> because the previous call to `commitTransaction` timed out and must be retried
>     at 
> org.apache.kafka.clients.producer.internals.TransactionManager.handleCachedTransactionRequestResult(TransactionManager.java:1203)
>     at 
> org.apache.kafka.clients.producer.internals.TransactionManager.beginAbort(TransactionManager.java:326)
>     at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:274) {code}
> and
> {code:java}
> [2024-01-16 04:19:32,584] ERROR [kafka-producer-network-thread | 
> i-01aea6907970b1bf6-StreamThread-1-producer] stream-thread 
> [i-01aea6907970b1bf6-StreamThread-1] stream-task [1_2] Error encountered 
> sending r       ecord to topic joined-counts for task 1_2 due to:
> org.apache.kafka.common.errors.InvalidProducerEpochException: Producer 
> attempted to produce with an old epoch.
> Written offsets would not be recorded and no more records would be sent since 
> the producer is fenced, indicating the task may be migrated out 
> (org.apache.kafka.streams.processor.internals.RecordCollectorImp       l)
> org.apache.kafka.common.errors.InvalidProducerEpochException: Producer 
> attempted to produce with an old epoch.
> // followed by
> [2024-01-16 04:19:32,587] ERROR [kafka-producer-network-thread | 
> i-01aea6907970b1bf6-StreamThread-1-producer] [Producer 
> clientId=i-01aea6907970b1bf6-StreamThread-1-producer, 
> transactionalId=stream-soak-test       
> -bbb995dc-1ba2-41ed-8791-0512ab4b904d-1] Aborting producer batches due to 
> fatal error (org.apache.kafka.clients.producer.internals.Sender)
> java.lang.IllegalStateException: TransactionalId 
> stream-soak-test-bbb995dc-1ba2-41ed-8791-0512ab4b904d-1: Invalid transition 
> attempted from state FATAL_ERROR to state ABORTABLE_ERROR
>         at 
> org.apache.kafka.clients.producer.internals.TransactionManager.transitionTo(TransactionManager.java:996)
>         at 
> org.apache.kafka.clients.producer.internals.TransactionManager.transitionToAbortableError(TransactionManager.java:451)
>         at 
> org.apache.kafka.clients.producer.internals.TransactionManager.maybeTransitionToErrorState(TransactionManager.java:664)
>         at 
> org.apache.kafka.clients.producer.internals.TransactionManager.handleFailedBatch(TransactionManager.java:669)
>         at 
> org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:835)
>         at 
> org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:819)
>         at 
> org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:771)
>         at 
> org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:702)
>         at 
> org.apache.kafka.clients.producer.internals.Sender.lambda$null$1(Sender.java:627)
>         at java.util.ArrayList.forEach(ArrayList.java:1259)
>         at 
> org.apache.kafka.clients.producer.internals.Sender.lambda$handleProduceResponse$2(Sender.java:612)
>         at java.lang.Iterable.forEach(Iterable.java:75)
>         at 
> org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:612)
>         at 
> org.apache.kafka.clients.producer.internals.Sender.lambda$sendProduceRequest$8(Sender.java:917)
>         at 
> org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:154)
>         at 
> org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:608)
>         at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:600)
>         at 
> org.apache.kafka.clients.producer.internals.Sender.maybeSendAndPollTransactionalRequest(Sender.java:460)
>         at 
> org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:337)
>         at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:252)
>         at java.lang.Thread.run(Thread.java:750) {code}
> If the Producer throws an IllegalStateException on `abortTransaction()` Kafka 
> Streams treats this exception ("correctly") as fatal, and StreamsThread dies. 
> However, Kafka Streams is actually in a state in which it can recover from, 
> and thus should not let StreamThread die by carry forward (by not calling 
> `abortTransaction()` and moving forward with the dirty close of the task).
>  
> It is unclear right now, how 
> https://issues.apache.org/jira/browse/KAFKA-14567 is related – it has a 
> similar stack trace, but it was reported before 
> https://issues.apache.org/jira/browse/KAFKA-14831 was merged.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to