[ 
https://issues.apache.org/jira/browse/KAFKA-10421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17180563#comment-17180563
 ] 

Ismael Juma commented on KAFKA-10421:
-------------------------------------

Also, would you be able to upgrade to 2.5.1 and check if you still see the 
issue?

> Kafka Producer deadlocked on get() call
> ---------------------------------------
>
>                 Key: KAFKA-10421
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10421
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 2.3.0
>         Environment: CentOS7
>            Reporter: Ranadeep Deb
>            Priority: Critical
>
> I have been experiencing a similar issue in 2.3.0.
> I have a multithreaded application in which each thread sends an individual 
> message to the broker. In some instances, I have observed the producer 
> threads getting stuck on the Producer.send().get() call. I was not sure 
> what was causing this, but after landing on this thread 
> (https://issues.apache.org/jira/browse/KAFKA-8135) I suspect that an 
> intermittent network outage might be the reason. 
> I am curious about how to solve this.
>  
> The following is the stack trace of the Java threads:
>  
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.77-b03 mixed 
> mode):
>  "Attach Listener" #15081 daemon prio=9 os_prio=0 tid=0x00007f9c50002000 
> nid=0xe572 waiting on condition [0x0000000000000000]   
> java.lang.Thread.State: RUNNABLE
> "pool-14658-thread-9" #15071 prio=5 os_prio=0 tid=0x00007f9c9842f800 
> nid=0x397b waiting on condition [0x00007f9c378fb000]   
> java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native 
> Method) - parking to wait for  <0x00000007703e85b8> (a 
> java.util.concurrent.CountDownLatch$Sync) at 
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>  at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) at 
> org.apache.kafka.clients.producer.internals.ProduceRequestResult.await(ProduceRequestResult.java:76)
>  at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:64)
>  at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:30)
>  at com.t100.sender.T100KafkaProducer.runProducer(T100KafkaProducer.java:104) 
> at com.t100.sender.T100KafkaProducer.run(T100KafkaProducer.java:165) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
>  
> "pool-14658-thread-8" #15070 prio=5 os_prio=0 tid=0x00007f9c9842e000 
> nid=0x397a waiting on condition [0x00007f9c379fc000]   
> java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native 
> Method) - parking to wait for  <0x00000007704dabb0> (a 
> java.util.concurrent.CountDownLatch$Sync) at 
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>  at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) at 
> org.apache.kafka.clients.producer.internals.ProduceRequestResult.await(ProduceRequestResult.java:76)
>  at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:64)
>  at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:30)
>  at com.t100.sender.T100KafkaProducer.runProducer(T100KafkaProducer.java:104) 
> at com.t100.sender.T100KafkaProducer.run(T100KafkaProducer.java:165) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
>  
> "pool-14658-thread-7" #15069 prio=5 os_prio=0 tid=0x00007f9c9842d800 
> nid=0x3979 waiting on condition [0x00007f9c371f4000]   
> java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native 
> Method) - parking to wait for  <0x00000007705ed590> (a 
> java.util.concurrent.CountDownLatch$Sync) at 
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>  at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) at 
> org.apache.kafka.clients.producer.internals.ProduceRequestResult.await(ProduceRequestResult.java:76)
>  at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:64)
>  at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:30)
>  at com.t100.sender.T100KafkaProducer.runProducer(T100KafkaProducer.java:104) 
> at com.t100.sender.T100KafkaProducer.run(T100KafkaProducer.java:165) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
>  
> "pool-14658-thread-6" #15068 prio=5 os_prio=0 tid=0x00007f9c9842c800 
> nid=0x3978 waiting on condition [0x00007f9c375f8000]   
> java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native 
> Method) - parking to wait for  <0x000000077012e2e0> (a 
> java.util.concurrent.CountDownLatch$Sync) at 
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>  at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) at 
> org.apache.kafka.clients.producer.internals.ProduceRequestResult.await(ProduceRequestResult.java:76)
>  at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:64)
>  at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:30)
>  at com.t100.sender.T100KafkaProducer.runProducer(T100KafkaProducer.java:104) 
> at com.t100.sender.T100KafkaProducer.run(T100KafkaProducer.java:165) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
>  
> "pool-14658-thread-5" #15067 prio=5 os_prio=0 tid=0x00007f9c9842c000 
> nid=0x3977 waiting on condition [0x00007f9c372f4000]   
> java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native 
> Method) - parking to wait for  <0x00000007705d7e58> (a 
> java.util.concurrent.CountDownLatch$Sync) at 
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>  at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) at 
> org.apache.kafka.clients.producer.internals.ProduceRequestResult.await(ProduceRequestResult.java:76)
>  at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:64)
>  at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:30)
>  at com.t100.sender.T100KafkaProducer.runProducer(T100KafkaProducer.java:104) 
> at com.t100.sender.T100KafkaProducer.run(T100KafkaProducer.java:165) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
>  
> "pool-14658-thread-4" #15066 prio=5 os_prio=0 tid=0x00007f9c98433000 
> nid=0x3976 waiting on condition [0x00007f9c376f8000]   
> java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native 
> Method) - parking to wait for  <0x0000000770320b48> (a 
> java.util.concurrent.CountDownLatch$Sync) at 
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>  at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) at 
> org.apache.kafka.clients.producer.internals.ProduceRequestResult.await(ProduceRequestResult.java:76)
>  at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:64)
>  at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:30)
>  at com.t100.sender.T100KafkaProducer.runProducer(T100KafkaProducer.java:104) 
> at com.t100.sender.T100KafkaProducer.run(T100KafkaProducer.java:165) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
>  
> "pool-14658-thread-3" #15065 prio=5 os_prio=0 tid=0x00007f9c98432800 
> nid=0x3975 waiting on condition [0x00007f9c374f6000]   
> java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native 
> Method) - parking to wait for  <0x0000000770281ff0> (a 
> java.util.concurrent.CountDownLatch$Sync) at 
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>  at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) at 
> org.apache.kafka.clients.producer.internals.ProduceRequestResult.await(ProduceRequestResult.java:76)
>  at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:64)
>  at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:30)
>  at com.t100.sender.T100KafkaProducer.runProducer(T100KafkaProducer.java:104) 
> at com.t100.sender.T100KafkaProducer.run(T100KafkaProducer.java:165) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
>  
> "pool-14658-thread-2" #15064 prio=5 os_prio=0 tid=0x00007f9c9857d000 
> nid=0x3974 waiting on condition [0x00007f9c370f2000]   
> java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native 
> Method) - parking to wait for  <0x0000000770504cd0> (a 
> java.util.concurrent.CountDownLatch$Sync) at 
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>  at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) at 
> org.apache.kafka.clients.producer.internals.ProduceRequestResult.await(ProduceRequestResult.java:76)
>  at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:64)
>  at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:30)
>  at com.t100.sender.T100KafkaProducer.runProducer(T100KafkaProducer.java:104) 
> at com.t100.sender.T100KafkaProducer.run(T100KafkaProducer.java:165) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> "pool-14658-thread-1" #15063 prio=5 os_prio=0 tid=0x00007f9c983fe800 
> nid=0x3973 waiting on condition [0x00007f9c37afd000]   
> java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native 
> Method) - parking to wait for  <0x00000007701f9a60> (a 
> java.util.concurrent.CountDownLatch$Sync) at 
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>  at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) at 
> org.apache.kafka.clients.producer.internals.ProduceRequestResult.await(ProduceRequestResult.java:76)
>  at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:64)
>  at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:30)
>  at com.t100.sender.T100KafkaProducer.runProducer(T100KafkaProducer.java:104) 
> at com.t100.sender.T100KafkaProducer.run(T100KafkaProducer.java:165) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> "Service Thread" #8 daemon prio=9 os_prio=0 tid=0x00007f9c981e5800 
> nid=0x13e08 runnable [0x0000000000000000]   java.lang.Thread.State: RUNNABLE
> "C1 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x00007f9c981c8000 
> nid=0x13e07 waiting on condition [0x0000000000000000]   
> java.lang.Thread.State: RUNNABLE
> "C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007f9c981c6800 
> nid=0x13e06 waiting on condition [0x0000000000000000]   
> java.lang.Thread.State: RUNNABLE
> "C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007f9c981c3800 
> nid=0x13e05 waiting on condition [0x0000000000000000]   
> java.lang.Thread.State: RUNNABLE
> "Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007f9c981c1800 
> nid=0x13e04 runnable [0x0000000000000000]   java.lang.Thread.State: RUNNABLE
> "Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f9c9818f000 nid=0x13e02 in 
> Object.wait() [0x00007f9c8174f000]   java.lang.Thread.State: WAITING (on 
> object monitor) at java.lang.Object.wait(Native Method) at 
> java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143) - locked 
> <0x00000006c805d100> (a java.lang.ref.ReferenceQueue$Lock) at 
> java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164) at 
> java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
> "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007f9c9818a800 
> nid=0x13e01 in Object.wait() [0x00007f9c81850000]   java.lang.Thread.State: 
> WAITING (on object monitor) at java.lang.Object.wait(Native Method) at 
> java.lang.Object.wait(Object.java:502) at 
> java.lang.ref.Reference.tryHandlePending(Reference.java:191) - locked 
> <0x00000006c8061248> (a java.lang.ref.Reference$Lock) at 
> java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
> "main" #1 prio=5 os_prio=0 tid=0x00007f9c98009000 nid=0x13df6 waiting on 
> condition [0x00007f9c9fcf1000]   java.lang.Thread.State: TIMED_WAITING 
> (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for  
> <0x000000076fd87ac0> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
>  at 
> java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1465)
>  at 
> com.t100.sender.T100TelemetryLogMonitor.uploadFiles(T100TelemetryLogMonitor.java:201)
>  at 
> com.t100.sender.T100TelemetryLogMonitor.main(T100TelemetryLogMonitor.java:114)
> "VM Thread" os_prio=0 tid=0x00007f9c98183000 nid=0x13e00 runnable 
> "GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f9c9801e800 nid=0x13df8 
> runnable 
> "GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007f9c98020800 nid=0x13df9 
> runnable 
> "GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007f9c98022000 nid=0x13dfa 
> runnable 
> "GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007f9c98024000 nid=0x13dfb 
> runnable 
> "GC task thread#4 (ParallelGC)" os_prio=0 tid=0x00007f9c98026000 nid=0x13dfc 
> runnable 
> "GC task thread#5 (ParallelGC)" os_prio=0 tid=0x00007f9c98027800 nid=0x13dfd 
> runnable 
> "VM Periodic Task Thread" os_prio=0 tid=0x00007f9c981e8800 nid=0x13e09 
> waiting on condition 
> JNI global references: 556
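
In the dump above, every pool thread is parked inside FutureRecordMetadata.get() with no timeout, so a send that is never acknowledged keeps the thread waiting indefinitely. Since KafkaProducer.send() returns a standard java.util.concurrent.Future, one possible mitigation is to bound the wait with get(timeout, unit). The sketch below is not taken from the reporter's code: the class name BoundedGet and the helper getWithTimeout are hypothetical, and a CompletableFuture that never completes stands in for a send whose acknowledgement never arrives. It only limits how long the caller blocks and does not address whatever prevents the future from completing.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedGet {

    /**
     * Hypothetical helper: waits at most timeoutMs for the future to complete.
     * Returns the result if it arrives in time, or Optional.empty() on timeout,
     * so the calling thread is released instead of parking forever.
     */
    public static <T> Optional<T> getWithTimeout(Future<T> future, long timeoutMs)
            throws InterruptedException, ExecutionException {
        try {
            return Optional.ofNullable(future.get(timeoutMs, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            // The caller decides what to do next: log, retry, or fail the task.
            return Optional.empty();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a send() whose acknowledgement never arrives (assumption).
        Future<String> neverAcked = new CompletableFuture<>();
        System.out.println(getWithTimeout(neverAcked, 200).isPresent()); // prints false

        // Stand-in for a send() that has already been acknowledged.
        Future<String> acked = CompletableFuture.completedFuture("offset-42");
        System.out.println(getWithTimeout(acked, 200).get()); // prints offset-42
    }
}
```

With a real producer, the same helper would be called on the Future<RecordMetadata> returned by send(), with a timeout chosen to suit the application.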



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
