[
https://issues.apache.org/jira/browse/KAFKA-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681680#comment-14681680
]
PC commented on KAFKA-2078:
---------------------------
I can reproduce this bug, though doing so is difficult.
Running on Mac OS X 10.9.5 with 16 GB RAM
Java version 1.8.0_40
It appears to affect only the producer:
org.apache.kafka.clients.producer.KafkaProducer 0.8.2.1
Setup:
3 producers pumping test data to one kafka-server with 1 replica, all running
locally on the same machine. Each producer uses the async
.send(producerRecord, callBack) method.
The configs are at the bottom of this post.
Here is a log snippet:
16:21:51.527 [message-consumer-akka.actor.default-dispatcher-5] DEBUG producer - PumpSuccess topic: test partition 0 offset: 3330477
16:21:51.528 [message-consumer-akka.actor.default-dispatcher-5] DEBUG producer - PumpSuccess topic: test partition 0 offset: 3330478
16:21:51.528 [message-consumer-akka.actor.default-dispatcher-5] DEBUG producer - PumpSuccess topic: test partition 0 offset: 3330479
16:21:51.528 [message-consumer-akka.actor.default-dispatcher-5] DEBUG producer - PumpSuccess topic: test partition 0 offset: 3330480
16:21:51.528 [message-consumer-akka.actor.default-dispatcher-5] DEBUG producer - PumpSuccess topic: test partition 0 offset: 3330481
16:26:26.220 [kafka-producer-network-thread | producer-3] WARN o.a.kafka.common.network.Selector - Error in I/O with localhost/127.0.0.1
java.io.EOFException: null
    at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62) ~[kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.common.network.Selector.poll(Selector.java:248) ~[kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) [kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) [kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) [kafka-clients-0.8.2.1.jar:na]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
16:26:26.220 [kafka-producer-network-thread | producer-2] WARN o.a.kafka.common.network.Selector - Error in I/O with localhost/127.0.0.1
java.io.EOFException: null
    at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62) ~[kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.common.network.Selector.poll(Selector.java:248) ~[kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) [kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) [kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) [kafka-clients-0.8.2.1.jar:na]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
16:26:26.220 [kafka-producer-network-thread | producer-1] WARN o.a.kafka.common.network.Selector - Error in I/O with localhost/127.0.0.1
java.io.EOFException: null
    at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62) ~[kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.common.network.Selector.poll(Selector.java:248) ~[kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) [kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) [kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) [kafka-clients-0.8.2.1.jar:na]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
Pay attention to the timestamps. The last PumpSuccess line is logged at
16:21:51.528 and the 3 exceptions at 16:26:26.220, so the kafka-producer
internals logged them roughly 4 minutes 35 seconds after the producers had
FINISHED pumping the data.
Worse, this bug also occurred 2 days ago while the producers were still
pumping messages to the broker. When it kicked in, the CallBack code was not
called for 3 messages (1 per producer), and no exception was thrown in my
application. This can lead to serious data loss.
I would upgrade this bug to SEVERE/CRITICAL rather than Major in a heartbeat.
A temporary (and unacceptable) workaround is to block on every send with a
timeout, so that we notice a lost message when this bug manifests itself:
try {
  ....
  // Block until the send completes, or fail with a TimeoutException after 5 s
  kafkaProducer.send(record, callBack).get(5, TimeUnit.SECONDS)
} catch {
  ....
}
For a single producer, this approach cuts throughput from roughly 60k
messages/sec with the async send to roughly 5k messages/sec, about a 12x drop.
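One way to keep the async path while still noticing a callback that never fires might be to track outstanding sends on the application side. The sketch below is not part of kafka-clients and does not fix the underlying bug; `PendingSends`, `register`, `complete` and `overdue` are hypothetical names, and it uses only the JDK so that it runs on its own:

```scala
import java.util.concurrent.{ConcurrentHashMap, TimeUnit}
import java.util.concurrent.atomic.AtomicLong

// Hypothetical helper: remembers every record handed to send() until its
// callback reports back, so that silently dropped callbacks can be detected.
final class PendingSends[A] {
  private val nextId  = new AtomicLong(0L)
  // id -> (record, System.nanoTime() at registration)
  private val pending = new ConcurrentHashMap[Long, (A, Long)]()

  // Call just before send(); the returned id is captured by the callback.
  def register(record: A): Long = {
    val id = nextId.incrementAndGet()
    pending.put(id, (record, System.nanoTime()))
    id
  }

  // Call from the callback, whether it reports success or an exception.
  def complete(id: Long): Unit = {
    pending.remove(id)
    ()
  }

  // Records registered at least `minAge` ago whose callback has not fired.
  def overdue(minAge: Long, unit: TimeUnit): List[A] = {
    val cutoff = System.nanoTime() - unit.toNanos(minAge)
    val found  = List.newBuilder[A]
    val it     = pending.values().iterator()
    while (it.hasNext) {
      val (record, sentAt) = it.next()
      if (sentAt - cutoff <= 0) found += record
    }
    found.result()
  }
}
```

The id returned by `register` would be captured by the callback passed to `.send(producerRecord, callBack)`, which calls `complete(id)` from `onCompletion`. A periodic check of `overdue(5, TimeUnit.SECONDS)` would then surface records whose callback never ran, and the application could log or resend them without blocking on every send.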
Config properties:
Kafka-Server:
broker.id=0
port=9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
#log.flush.interval.messages=10000
log.flush.interval.ms=5000
delete.topic.enable=true
log.retention.hours=2147483640
log.segment.bytes=1073741824
log.retention.check.interval.ms=30000000
log.cleaner.enable=false
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=12000
offsets.topic.retention.minutes=28800
offset.metadata.max.bytes=4096
offsets.topic.num.partitions=50
offsets.retention.check.interval.ms=600000
offsets.topic.replication.factor=3
offsets.topic.segment.bytes=104857600
offsets.load.buffer.size=5242880
offsets.commit.required.acks=-1
offsets.commit.timeout.ms=5000
default.replication.factor=1
num.partitions=1
auto.create.topics.enable=true
unclean.leader.election.enable=false
Zookeeper:
dataDir=/tmp/zookeeper
clientPort=2181
maxClientCnxns=0
Producer:
kafkaProducerProps.put(ProducerConfig.ACKS_CONFIG, "1")
kafkaProducerProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "127.0.0.1:9092")
kafkaProducerProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)
kafkaProducerProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)
Could someone please look into this problem seriously? It really does exist.
> Getting Selector [WARN] Error in I/O with host java.io.EOFException
> -------------------------------------------------------------------
>
> Key: KAFKA-2078
> URL: https://issues.apache.org/jira/browse/KAFKA-2078
> Project: Kafka
> Issue Type: Bug
> Components: producer
> Affects Versions: 0.8.2.0
> Environment: OS Version: 2.6.39-400.209.1.el5uek and Hardware: 8 x
> Intel(R) Xeon(R) CPU X5660 @ 2.80GHz/44GB
> Reporter: Aravind
> Assignee: Jun Rao
>
> When trying to produce 1000 (10 MB) messages, I get the error below
> somewhere between the 997th and 1000th message. There is no pattern, but I
> am able to reproduce it.
> [PDT] 2015-03-31 13:53:50 Selector [WARN] Error in I/O with "our host"
> java.io.EOFException
>     at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62)
>     at org.apache.kafka.common.network.Selector.poll(Selector.java:248)
>     at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192)
>     at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191)
>     at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122)
>     at java.lang.Thread.run(Thread.java:724)
> I sometimes get this error at the 997th message and sometimes at the 999th.