[ https://issues.apache.org/jira/browse/KAFKA-2813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15001441#comment-15001441 ]
ASF GitHub Bot commented on KAFKA-2813:
---------------------------------------
GitHub user junrao opened a pull request:
https://github.com/apache/kafka/pull/501
KAFKA-2813: selector doesn't close socket connection on non-IOExceptions
Patched Selector.poll() to close the connection on any exception.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/junrao/kafka KAFKA-2813
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/kafka/pull/501.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #501
----
commit 2a4dfd4d63f3b3383d9ce01fce7c2be151ef9f78
Author: Jun Rao <[email protected]>
Date: 2015-11-12T01:11:49Z
KAFKA-2813: selector doesn't close socket connection on non-IOExceptions
----
> selector doesn't close socket connection on non-IOExceptions
> ------------------------------------------------------------
>
> Key: KAFKA-2813
> URL: https://issues.apache.org/jira/browse/KAFKA-2813
> Project: Kafka
> Issue Type: Bug
> Components: core
> Reporter: Jun Rao
> Assignee: Jun Rao
> Priority: Blocker
> Fix For: 0.9.0.0
>
>
> When running a system test, we saw many log entries like the one below. The
> issue arises when the current leader becomes a follower: the broker then
> truncates its log, and a fetch request may be in the middle of being served
> at that moment. If so, a KafkaException is thrown (in FileMessageSet) while
> the fetch response is being sent. The exception propagates through
> Selector.poll(). Selector catches IOException and closes the corresponding
> socket, but KafkaException is not an IOException. Since the socket is never
> closed, Selector.poll() keeps accessing it and keeps hitting the same error.
> [2015-11-11 07:25:01,150] ERROR Processor got uncaught exception. (kafka.network.Processor)
> kafka.common.KafkaException: Size of FileMessageSet /mnt/kafka-data-logs/test_topic-0/00000000000000000000.log has been truncated during write: old size 16368, new size 0
>     at kafka.log.FileMessageSet.writeTo(FileMessageSet.scala:158)
>     at kafka.api.PartitionDataSend.writeTo(FetchResponse.scala:77)
>     at org.apache.kafka.common.network.MultiSend.writeTo(MultiSend.java:81)
>     at kafka.api.TopicDataSend.writeTo(FetchResponse.scala:148)
>     at org.apache.kafka.common.network.MultiSend.writeTo(MultiSend.java:81)
>     at kafka.api.FetchResponseSend.writeTo(FetchResponse.scala:291)
>     at org.apache.kafka.common.network.KafkaChannel.send(KafkaChannel.java:165)
>     at org.apache.kafka.common.network.KafkaChannel.write(KafkaChannel.java:152)
>     at org.apache.kafka.common.network.Selector.poll(Selector.java:301)
>     at kafka.network.Processor.run(SocketServer.scala:413)
>     at java.lang.Thread.run(Thread.java:745)
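
The failure mode described in the issue can be illustrated with a small, self-contained sketch. This is not the actual patch from the pull request: the names PollSketch, Channel, and FailingChannel are hypothetical stand-ins, and IllegalStateException stands in for KafkaException. It only shows the general idea of catching any exception during a write, rather than only IOException, so that a failing connection is closed and removed instead of being retried on every poll.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class PollSketch {
    /** Hypothetical stand-in for a channel registered with a selector. */
    interface Channel {
        String id();
        void write() throws IOException;
        void close();
    }

    /** A channel whose write always fails with an unchecked exception. */
    static final class FailingChannel implements Channel {
        int writeAttempts = 0;
        boolean closed = false;

        public String id() { return "conn-1"; }

        public void write() {
            writeAttempts++;
            // Stand-in for the KafkaException thrown when the file is truncated.
            throw new IllegalStateException("file truncated during write");
        }

        public void close() { closed = true; }
    }

    /** One poll pass over all channels; returns the ids of channels closed in this pass. */
    static List<String> poll(List<Channel> channels) {
        List<String> disconnected = new ArrayList<>();
        Iterator<Channel> it = channels.iterator();
        while (it.hasNext()) {
            Channel ch = it.next();
            try {
                ch.write();
            } catch (Exception e) {
                // Catching Exception (not only IOException) means an unchecked
                // failure also closes the connection and drops it from the set.
                ch.close();
                it.remove();
                disconnected.add(ch.id());
            }
        }
        return disconnected;
    }

    public static void main(String[] args) {
        FailingChannel ch = new FailingChannel();
        List<Channel> channels = new ArrayList<>();
        channels.add(ch);
        System.out.println(poll(channels));   // [conn-1]
        System.out.println(poll(channels));   // []
        System.out.println(ch.writeAttempts); // 1
    }
}
```

If the catch clause were narrowed to IOException, the unchecked exception would escape poll() with the channel still registered, so every later pass would attempt the same write and fail the same way, which matches the repeated log entries above.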