[ 
https://issues.apache.org/jira/browse/KAFKA-19397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17973501#comment-17973501
 ] 

Lucas Brutschy commented on KAFKA-19397:
----------------------------------------

Note that there is further analysis in the comments of 
[https://github.com/apache/kafka/pull/15968]. The problem seems to occur when 
topics are deleted and recreated, or when the client rebootstraps.

For Kafka Streams, this NPE is fatal, since it will cause the producer to get 
stuck here:
```
    at java.util.concurrent.locks.LockSupport.park(java.base@17.0.12/LockSupport.java:211)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@17.0.12/AbstractQueuedSynchronizer.java:715)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(java.base@17.0.12/AbstractQueuedSynchronizer.java:1047)
    at java.util.concurrent.CountDownLatch.await(java.base@17.0.12/CountDownLatch.java:230)
    at org.apache.kafka.clients.producer.internals.ProduceRequestResult.await(ProduceRequestResult.java:76)
    at org.apache.kafka.clients.producer.internals.RecordAccumulator.awaitFlushCompletion(RecordAccumulator.java:1075)
    at org.apache.kafka.clients.producer.KafkaProducer.flush(KafkaProducer.java:1325)
```

So, one after another, all Streams instances get stuck during flush until 
no progress is being made at all.
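
To illustrate why a single NPE on the sender thread can block flush() 
forever, here is a minimal sketch of the mechanism. It is not the actual 
producer code: the class StuckFlushSketch, its Batch type and the methods 
complete and flushCompletes are hypothetical stand-ins. The sketch assumes 
that the completion path dereferences a looked-up batch before it releases 
the latch that flush() waits on, and that the exception is only logged.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class StuckFlushSketch {

    // Hypothetical stand-in for a producer batch; not the real ProducerBatch class.
    static final class Batch {
        final String topicPartition;
        final CountDownLatch done = new CountDownLatch(1); // what flush() waits on

        Batch(String topicPartition) {
            this.topicPartition = topicPartition;
        }
    }

    // Hypothetical completion handler: it reads a field of a looked-up batch
    // before releasing the latch, so a null lookup aborts the completion.
    static void complete(Batch inFlight, Batch lookedUp) {
        String tp = lookedUp.topicPartition; // NullPointerException if lookedUp is null
        System.out.println("completed batch for " + tp);
        inFlight.done.countDown(); // never reached when the line above throws
    }

    // Returns true if a flush-like wait on the in-flight batch finishes in time.
    static boolean flushCompletes(boolean lookupSucceeds, long timeoutMs) {
        Batch inFlight = new Batch("topic-0");
        Batch lookedUp = lookupSucceeds ? inFlight : null;
        Thread sender = new Thread(() -> {
            try {
                complete(inFlight, lookedUp);
            } catch (RuntimeException e) {
                // Assumption: the error is logged and swallowed, as the
                // "Uncaught error in request completion" log line suggests.
                System.out.println("Uncaught error in request completion: "
                        + e.getClass().getSimpleName());
            }
        });
        sender.start();
        try {
            sender.join();
            // The sketch uses a timeout only so that it terminates.
            return inFlight.done.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(flushCompletes(true, 100));
        System.out.println(flushCompletes(false, 100));
    }
}
```

In the sketch, the wait succeeds only when the lookup returns a batch. The 
stack trace above shows an untimed CountDownLatch.await, so in the real 
producer the equivalent wait would never return.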

> TransactionManager.handleCompletedBatch throws NPE
> --------------------------------------------------
>
>                 Key: KAFKA-19397
>                 URL: https://issues.apache.org/jira/browse/KAFKA-19397
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 4.1.0
>            Reporter: Lucas Brutschy
>            Assignee: Omnia Ibrahim
>            Priority: Blocker
>         Attachments: streams.log
>
>
> Sometimes, current trunk throws the following NPE:
>  
> {code:java}
> [2025-05-29 04:06:05,855] ERROR [kafka-producer-network-thread | i-07bbab180f6062ba3-StreamThread-3-producer] [Producer clientId=i-07bbab180f6062ba3-StreamThread-3-producer] Uncaught error in request completion: (org.apache.kafka.clients.NetworkClient)
> java.lang.NullPointerException: Cannot read field "topicPartition" because "batch" is null
> at org.apache.kafka.clients.producer.internals.TransactionManager.handleCompletedBatch(TransactionManager.java:748)
> at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:736)
> at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:710)
> at org.apache.kafka.clients.producer.internals.Sender.lambda$handleProduceResponse$2(Sender.java:613)
> at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
> at org.apache.kafka.clients.producer.internals.Sender.lambda$handleProduceResponse$3(Sender.java:597)
> at java.base/java.lang.Iterable.forEach(Iterable.java:75)
> at org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:597)
> at org.apache.kafka.clients.producer.internals.Sender.lambda$sendProduceRequest$9(Sender.java:895)
> at org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:154)
> at org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:669)
> at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:661)
> at org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:340)
> at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:242)
> at java.base/java.lang.Thread.run(Thread.java:840)
> {code}
>  
> This was discovered in a long-running test, so we do not have a directly 
> reproducible test case. However, DEBUG logs are included below, which show 
> the sequence of METADATA and PRODUCE requests / responses that seem to cause 
> this.
>  
> The likely cause is the change here: [https://github.com/apache/kafka/pull/15968]


