We are using Kafka 0.8.2.1 with the old producer. When a broker machine has a
problem (we are not entirely sure what the problem is, though it generally
looks like the host being down), it sometimes takes some of our producers
about 15 minutes to time out. Our producer settings are:
request.required.acks=1
message.send.max.retries=5
retry.backoff.ms=5000
compression.codec=gzip
Custom metadata.broker.list, serializer.class, key.serializer.class,
partitioner.class
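
For reference, here is a minimal sketch of how we build the producer with
these settings through the old Java API. The broker hosts, topic name,
serializer, and partitioner values below are placeholders, not our actual
classes:

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class ProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Settings listed above
        props.put("request.required.acks", "1");
        props.put("message.send.max.retries", "5");
        props.put("retry.backoff.ms", "5000");
        props.put("compression.codec", "gzip");
        // Placeholder values for the custom settings; our real values differ
        props.put("metadata.broker.list", "broker1:9092,broker2:9092,broker3:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("key.serializer.class", "kafka.serializer.StringEncoder");
        props.put("partitioner.class", "kafka.producer.DefaultPartitioner");

        Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));

        // send() goes through kafka.producer.Producer.send ->
        // DefaultEventHandler -> SyncProducer, i.e. the path in the
        // stack trace below
        producer.send(new KeyedMessage<String, String>("<TOPIC_NAME>", "some-key", "some-value"));
        producer.close();
    }
}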

After ~15 minutes the following exception is emitted, and then things
continue on as usual:
15/08/06 07:47:25 WARN Partition 233 async.DefaultEventHandler 89: Failed to send producer request with correlation id 2345860 to broker 3 with data for partitions [<TOPIC_NAME>,238],[<TOPIC_NAME>,228]
java.io.IOException: Connection timed out
       at sun.nio.ch.FileDispatcherImpl.writev0(Native Method)
       at sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51)
       at sun.nio.ch.IOUtil.write(IOUtil.java:148)
       at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:524)
       at java.nio.channels.SocketChannel.write(SocketChannel.java:493)
       at kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:56)
       at kafka.network.Send$class.writeCompletely(Transmission.scala:75)
       at kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:26)
       at kafka.network.BlockingChannel.send(BlockingChannel.scala:103)
       at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:73)
       at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:72)
       at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SyncProducer.scala:103)
       at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:103)
       at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:103)
       at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
       at kafka.producer.SyncProducer$$anonfun$send$1.apply$mcV$sp(SyncProducer.scala:102)
       at kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:102)
       at kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:102)
       at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
       at kafka.producer.SyncProducer.send(SyncProducer.scala:101)
       at kafka.producer.async.DefaultEventHandler.kafka$producer$async$DefaultEventHandler$$send(DefaultEventHandler.scala:255)
       at kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:106)
       at kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:100)
       at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
       at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
       at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
       at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
       at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
       at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
       at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
       at kafka.producer.async.DefaultEventHandler.dispatchSerializedData(DefaultEventHandler.scala:100)
       at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:72)
       at kafka.producer.Producer.send(Producer.scala:77)
       at kafka.javaapi.producer.Producer.send(Producer.scala:42)
       at com.dropbox.vortex.kafka.KafkaStatEmitter.sendMessages(KafkaStatEmitter.java:230)
       at com.dropbox.vortex.kafka.KafkaStatEmitter.batchSend(KafkaStatEmitter.java:213)
       at com.dropbox.vortex.aggregator.TagAggregatorWorker.flushSlots(TagAggregatorWorker.java:376)
       at com.dropbox.vortex.aggregator.TagAggregatorWorker.flush(TagAggregatorWorker.java:306)
       at com.dropbox.vortex.aggregator.TagAggregatorWorker.run(TagAggregatorWorker.java:228)
       at com.dropbox.vortex.cli.ThreadPoolWorkScheduler$3.run(ThreadPoolWorkScheduler.java:106)
       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
       at java.util.concurrent.FutureTask.run(FutureTask.java:262)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       at java.lang.Thread.run(Thread.java:745)

Other producers in the same process fail much more quickly, with different
stack traces.
