[ https://issues.apache.org/jira/browse/KAFKA-19397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17973501#comment-17973501 ]
Lucas Brutschy commented on KAFKA-19397:
----------------------------------------

Note that there is some further analysis in the comments of [https://github.com/apache/kafka/pull/15968]. This problem seems to occur if topics are deleted and recreated, or during rebootstrap of the client.

For Kafka Streams, this NPE is fatal, since it will cause the producer to get stuck here:

```
    at java.util.concurrent.locks.LockSupport.park(java.base@17.0.12/LockSupport.java:211)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@17.0.12/AbstractQueuedSynchronizer.java:715)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(java.base@17.0.12/AbstractQueuedSynchronizer.java:1047)
    at java.util.concurrent.CountDownLatch.await(java.base@17.0.12/CountDownLatch.java:230)
    at org.apache.kafka.clients.producer.internals.ProduceRequestResult.await(ProduceRequestResult.java:76)
    at org.apache.kafka.clients.producer.internals.RecordAccumulator.awaitFlushCompletion(RecordAccumulator.java:1075)
    at org.apache.kafka.clients.producer.KafkaProducer.flush(KafkaProducer.java:1325)
```

So, one after the other, all Streams instances get stuck during flush until no progress is being made anymore.
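To illustrate why an exception on the sender thread can leave flush() parked on a latch, here is a minimal, self-contained sketch. It is not the Kafka code: `LatchHangSketch`, `Batch`, `completeBatch`, and `flushCompletes` are hypothetical names, and the assumption that the NPE is thrown before the latch is counted down is an inference from the two stack traces, not something confirmed in this ticket.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LatchHangSketch {

    /** Hypothetical stand-in for a produce batch; not a Kafka class. */
    static final class Batch {
        final String topicPartition;
        final CountDownLatch done = new CountDownLatch(1);

        Batch(String topicPartition) {
            this.topicPartition = topicPartition;
        }
    }

    /** Hypothetical completion handler: reads the tracked batch before releasing the latch. */
    static void completeBatch(Batch tracked, Batch inFlight) {
        // Throws NullPointerException when `tracked` is null, so countDown() is never reached.
        String tp = tracked.topicPartition;
        inFlight.done.countDown();
    }

    /** Returns true if the simulated flush is released within the timeout. */
    static boolean flushCompletes(boolean trackedBatchMissing) throws InterruptedException {
        Batch inFlight = new Batch("topic-0");
        Batch tracked = trackedBatchMissing ? null : inFlight;

        Thread sender = new Thread(() -> {
            try {
                completeBatch(tracked, inFlight);
            } catch (RuntimeException e) {
                // Assumption: the error is logged and swallowed, as the ERROR log line suggests.
                System.err.println("Uncaught error in request completion: " + e);
            }
        });
        sender.start();
        sender.join();

        // The trace above shows an untimed CountDownLatch.await(); a timeout is used
        // here only so that the sketch terminates.
        return inFlight.done.await(200, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("normal completion released flush: " + flushCompletes(false));
        System.out.println("missing batch released flush:     " + flushCompletes(true));
    }
}
```

In this sketch, `flushCompletes(false)` returns true and `flushCompletes(true)` returns false. With an untimed await, as in the trace, the second case would block indefinitely, which matches the observed hang.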
> TransactionManager.handleCompletedBatch throws NPE
> --------------------------------------------------
>
>                 Key: KAFKA-19397
>                 URL: https://issues.apache.org/jira/browse/KAFKA-19397
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 4.1.0
>            Reporter: Lucas Brutschy
>            Assignee: Omnia Ibrahim
>            Priority: Blocker
>         Attachments: streams.log
>
>
> Sometimes, current trunk throws the following NPE:
>
> {code:java}
> [2025-05-29 04:06:05,855] ERROR [kafka-producer-network-thread | i-07bbab180f6062ba3-StreamThread-3-producer] [Producer clientId=i-07bbab180f6062ba3-StreamThread-3-producer] Uncaught error in request completion: (org.apache.kafka.clients.NetworkClient)
> java.lang.NullPointerException: Cannot read field "topicPartition" because "batch" is null
>     at org.apache.kafka.clients.producer.internals.TransactionManager.handleCompletedBatch(TransactionManager.java:748)
>     at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:736)
>     at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:710)
>     at org.apache.kafka.clients.producer.internals.Sender.lambda$handleProduceResponse$2(Sender.java:613)
>     at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
>     at org.apache.kafka.clients.producer.internals.Sender.lambda$handleProduceResponse$3(Sender.java:597)
>     at java.base/java.lang.Iterable.forEach(Iterable.java:75)
>     at org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:597)
>     at org.apache.kafka.clients.producer.internals.Sender.lambda$sendProduceRequest$9(Sender.java:895)
>     at org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:154)
>     at org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:669)
>     at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:661)
>     at org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:340)
>     at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:242)
>     at java.base/java.lang.Thread.run(Thread.java:840)
> {code}
>
> This was discovered in a long-running test, so we do not have a directly reproducible test case. However, DEBUG logs are included below, which show the sequence of METADATA and PRODUCE requests / responses that seem to cause this.
>
> Likely cause is the change here: [https://github.com/apache/kafka/pull/15968]

--
This message was sent by Atlassian Jira
(v8.20.10#820010)