Tim Stilwell created KAFKA-600:
----------------------------------
Summary: kafka should respond gracefully rather than crash when
unable to write due to ENOSPC
Key: KAFKA-600
URL: https://issues.apache.org/jira/browse/KAFKA-600
Project: Kafka
Issue Type: Bug
Components: core
Reporter: Tim Stilwell
problem:
A user starts Kafka with log.dir set to a small partition and begins writing
data to the broker. When the partition fills up, Kafka halts with a FATAL
error. Given that the broker serves both reads and writes, crashing seems
rather drastic, even if the error message is helpful. Something more robust
would be appreciated. Perhaps log an error and reject further write requests
while continuing to accept read requests? Perhaps send an email alert to
Operations? At the very least, shut down gracefully with a message giving
some details of the last message received, so the user knows which received
messages were saved. When tens or hundreds of thousands of messages can be
processed per second, it isn't helpful to merely log a timestamp and crash.
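The "reject writes, keep serving reads" idea above can be illustrated with a small sketch. This is not Kafka code: it is written in Python for brevity, and AppendOnlyLog, ReadOnlyLogError and make_limited_writer are hypothetical names invented for the illustration. It assumes the write path surfaces ENOSPC as an OSError.

```python
import errno


class ReadOnlyLogError(Exception):
    """Raised when an append is rejected because the log is read-only."""


class AppendOnlyLog:
    """Toy in-memory stand-in for a broker log (hypothetical, not Kafka code)."""

    def __init__(self, write_fn):
        # write_fn persists one message and may raise OSError (e.g. ENOSPC).
        self._write_fn = write_fn
        self._messages = []
        self.read_only = False
        self.last_saved_offset = -1

    def append(self, message):
        if self.read_only:
            raise ReadOnlyLogError("log is read-only: no space left on device")
        try:
            self._write_fn(message)
        except OSError as exc:
            if exc.errno == errno.ENOSPC:
                # Disk full: stop accepting writes but stay up for readers,
                # and report the last offset that was actually saved.
                self.read_only = True
                raise ReadOnlyLogError(
                    "disk full, last saved offset=%d" % self.last_saved_offset
                ) from exc
            raise  # other I/O errors are outside the scope of this sketch
        self._messages.append(message)
        self.last_saved_offset += 1
        return self.last_saved_offset

    def read(self, offset):
        # Reads keep working even after the log has turned read-only.
        return self._messages[offset]


def make_limited_writer(capacity):
    """Return a fake writer that fails with ENOSPC after `capacity` writes."""
    state = {"written": 0}

    def write(message):
        if state["written"] >= capacity:
            raise OSError(errno.ENOSPC, "No space left on device")
        state["written"] += 1

    return write
```

With make_limited_writer(2), the first two appends succeed, the third raises ReadOnlyLogError naming offset 1 as the last saved message, and read(0) and read(1) still return data afterwards.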
steps to reproduce:
1) download and install kafka
2) modify server.properties
# vi /opt/kafka-0.7.2-incubating-src/config/server.properties
set log.dir=/var/log/kafka
3) modify log4j
# vi /opt/kafka-0.7.2-incubating-src/config/log4j.properties
set fileAppender.File=/var/log/kafka/kafka-request.log
4) start kafka service
$ sudo bash
# ulimit -c unlimited
# /opt/kafka-0.7.2-incubating-src/bin/kafka-server-start.sh \
    /opt/kafka-0.7.2-incubating-src/config/server.properties &
5) begin writing data to hostname:9092
6) review /var/log/kafka/kafka-request.log
results:
$ grep log.dir /opt/kafka-0.7.2-incubating-src/config/server.properties
log.dir=/var/log/kafka
$ df -h /var/log/kafka
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       4.0G  4.0G     0 100% /
$ tail /var/log/kafka/kafka-request.log
17627442 [ZkClient-EventThread-14-10.0.20.242:2181] INFO
kafka.server.KafkaZooKeeper - Begin registering broker topic
/brokers/topics/raw/0 with 1 partitions
17627444 [ZkClient-EventThread-14-10.0.20.242:2181] INFO
kafka.server.KafkaZooKeeper - End registering broker topic
/brokers/topics/raw/0
17627445 [ZkClient-EventThread-14-10.0.20.242:2181] INFO
kafka.server.KafkaZooKeeper - done re-registering broker
18337676 [kafka-processor-3] ERROR kafka.network.Processor - Closing socket
for /10.0.20.138 because of error
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:218)
at sun.nio.ch.IOUtil.read(IOUtil.java:191)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:359)
at kafka.utils.Utils$.read(Utils.scala:538)
at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
at kafka.network.Processor.read(SocketServer.scala:311)
at kafka.network.Processor.run(SocketServer.scala:214)
at java.lang.Thread.run(Thread.java:722)
18391974 [kafka-processor-4] INFO kafka.network.Processor - Closing socket
connection to /10.0.20.138.
18422004 [kafka-processor-5] INFO kafka.network.Processor - Closing socket
connection to /10.0.20.138.
18434563 [kafka-processor-6] INFO kafka.network.Processor - Closing socket
connection to /10.0.20.138.
18485005 [kafka-processor-7] INFO kafka.network.Processor - Closing socket
connection to /10.0.20.138.
18497083 [kafka-processor-0] INFO kafka.network.Processor - Closing socket
connection to /10.0.20.138.
18525720 [kafka-processor-1] INFO kafka.network.Processor - Closing socket
connection to /10.0.20.138.
18543843 [kafka-processor-2] INFO kafka.network.Processor - Closing socket
connection to /10.0.20.138.
18563230 [kafka-processor-4] INFO kafka.network.Processor - Closing socket
connection to /10.0.20.138.
18575613 [kafka-processor-5] INFO kafka.network.Processor - Closing socket
connection to /10.0.20.138.
18677568 [kafka-processor-6] ERROR kafka.network.Processor - Closing socket
for /10.0.20.138 because of error
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:218)
at sun.nio.ch.IOUtil.read(IOUtil.java:191)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:359)
at kafka.utils.Utils$.read(Utils.scala:538)
at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
at kafka.network.Processor.read(SocketServer.scala:311)
at kafka.network.Processor.run(SocketServer.scala:214)
at java.lang.Thread.run(Thread.java:722)
18828016 [kafka-processor-7] INFO kafka.network.Processor - Closing socket
connection to /10.0.20.248.
18844274 [kafka-processor-0] INFO kafka.network.Processor - Closing socket
connection to /10.0.20.248.
18849691 [kafka-processor-1] INFO kafka.network.Processor - Closing socket
connection to /10.0.20.248.
18896883 [kafka-processor-2] INFO kafka.network.Processor - Closing socket
connection to /10.0.20.248.
22383195 [kafka-processor-2] FATAL kafka.log.Log - Halting due to
unrecoverable I/O error while handling producer request
java.io.IOException: No space left on device
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:59)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:89)
at sun.nio.ch.IOUtil.write(IOUtil.java:60)
at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:195)
at kafka.message.ByteBufferMessageSet.writeTo(ByteBufferMessageSet.scala:76)
at kafka.message.FileMessageSet.append(FileMessageSet.scala:159)
at kafka.log.LogSegment.append(Log.scala:105)
at kafka.log.Log.liftedTree1$1(Log.scala:246)
at kafka.log.Log.append(Log.scala:242)
at kafka.server.KafkaRequestHandlers.kafka$server$KafkaRequestHandlers$$handleProducerRequest(KafkaRequestHandlers.scala:69)
at kafka.server.KafkaRequestHandlers.handleProducerRequest(KafkaRequestHandlers.scala:53)
at kafka.server.KafkaRequestHandlers$$anonfun$handlerFor$1.apply(KafkaRequestHandlers.scala:38)
at kafka.server.KafkaRequestHandlers$$anonfun$handlerFor$1.apply(KafkaRequestHandlers.scala:38)
at kafka.network.Processor.handle(SocketServer.scala:296)
at kafka.network.Processor.read(SocketServer.scala:319)
at kafka.network.Processor.run(SocketServer.scala:214)
at java.lang.Thread.run(Thread.java:722)
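
The df output shows the partition already at 100% by the time of the FATAL halt. Until the broker handles this itself, the alert suggested in the problem description could be approximated by an external free-space check on the log.dir partition. A minimal sketch, assuming Python 3 is available on the broker host; free_space_status, alert_message and the threshold are hypothetical, and how the alert is delivered (email or otherwise) is left out.

```python
import shutil


def free_space_status(path, min_free_bytes, usage_fn=shutil.disk_usage):
    """Return (ok, free_bytes) for the filesystem that holds `path`.

    usage_fn is injectable so the check can be exercised without a full disk;
    by default it is shutil.disk_usage, which reports total, used and free.
    """
    usage = usage_fn(path)
    return usage.free >= min_free_bytes, usage.free


def alert_message(path, free_bytes, min_free_bytes):
    """Build a one-line alert text for Operations (format is an assumption)."""
    return "low disk space on %s: %d bytes free, threshold %d" % (
        path, free_bytes, min_free_bytes)


if __name__ == "__main__":
    # Hypothetical values: the log.dir from this report and a 512 MiB floor.
    log_dir = "/var/log/kafka"
    threshold = 512 * 1024 * 1024
    ok, free = free_space_status(log_dir, threshold)
    if not ok:
        print(alert_message(log_dir, free, threshold))
```

Run from cron or a monitoring agent, a check like this would warn before writes start failing, which is earlier than anything the broker log above provides.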