[ 
https://issues.apache.org/jira/browse/KAFKA-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279396#comment-14279396
 ] 

Alexey Ozeritskiy commented on KAFKA-1804:
------------------------------------------

We've written a simple patch for kafka-network-thread:
{code:java}
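  override def run(): Unit = {
    try {
      // original_run() stands for the thread's original run() body
      original_run()
    } catch {
      case e: Throwable =>
        // Log the failure, then halt the JVM at once (no shutdown hooks run)
        error("ERROR IN NETWORK THREAD: %s".format(e), e)
        Runtime.getRuntime.halt(1)
    }
  }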
{code}
and got the following trace:
{code}
[2015-01-15 23:04:08,537] ERROR ERROR IN NETWORK THREAD: 
java.util.NoSuchElementException: None.get (kafka.network.Processor)
java.util.NoSuchElementException: None.get
        at scala.None$.get(Option.scala:313)
        at scala.None$.get(Option.scala:311)
        at kafka.network.ConnectionQuotas.dec(SocketServer.scala:544)
        at kafka.network.AbstractServerThread.close(SocketServer.scala:165)
        at kafka.network.AbstractServerThread.close(SocketServer.scala:157)
        at kafka.network.Processor.close(SocketServer.scala:394)
        at kafka.network.Processor.processNewResponses(SocketServer.scala:426)
        at kafka.network.Processor.iteration(SocketServer.scala:328)
        at kafka.network.Processor.run(SocketServer.scala:381)
        at java.lang.Thread.run(Thread.java:745)
{code}
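
To illustrate how a "None.get" like the one in the trace can come out of a counter's dec method, here is a minimal Scala sketch. It is an assumption for illustration only, not the actual code at SocketServer.scala:544: the QuotaSketch class, the String addresses, and the decUnsafe/decSafe names are all hypothetical.
{code:java}
import scala.collection.mutable

// Hypothetical, simplified stand-in for a per-address connection counter.
// Not the real kafka.network.ConnectionQuotas.
class QuotaSketch {
  private val counts = mutable.Map[String, Int]()

  def inc(addr: String): Unit = counts.synchronized {
    counts.put(addr, counts.getOrElse(addr, 0) + 1)
  }

  // Unsafe variant: Option.get throws NoSuchElementException
  // when addr is not (or no longer) tracked.
  def decUnsafe(addr: String): Unit = counts.synchronized {
    val count = counts.get(addr).get
    if (count == 1) counts.remove(addr) else counts.put(addr, count - 1)
  }

  // Defensive variant: returns false instead of throwing
  // when addr is not tracked, so the caller can log and carry on.
  def decSafe(addr: String): Boolean = counts.synchronized {
    counts.get(addr) match {
      case Some(1) => counts.remove(addr); true
      case Some(n) => counts.put(addr, n - 1); true
      case None    => false
    }
  }
}
{code}
In this sketch, a second decrement for an address whose count has already dropped to zero throws in decUnsafe but only returns false in decSafe. Whether that is what actually happens in the broker is not established by the trace alone.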

> Kafka network thread lacks top exception handler
> ------------------------------------------------
>
>                 Key: KAFKA-1804
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1804
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Oleg Golovin
>
> We have run into a problem where some Kafka network threads fail, so that 
> jstack attached to the Kafka process shows fewer threads than we defined in 
> our Kafka configuration. API requests handled by a failed thread then 
> get stuck with no response.
> There were no error messages in the log about the thread failure.
> We examined the Kafka code and found that there is no top-level try-catch 
> block in the network thread code, which could at least log such errors.
> Could you add a top-level try-catch block for the network thread that 
> recovers the thread in case of an exception?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
