[ https://issues.apache.org/jira/browse/KAFKA-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15062409#comment-15062409 ]
Mayuresh Gharat commented on KAFKA-1860:
----------------------------------------
Cool.
> File system errors are not detected unless Kafka tries to write
> ---------------------------------------------------------------
>
> Key: KAFKA-1860
> URL: https://issues.apache.org/jira/browse/KAFKA-1860
> Project: Kafka
> Issue Type: Bug
> Reporter: Guozhang Wang
> Assignee: Mayuresh Gharat
> Fix For: 0.10.0.0
>
> Attachments: KAFKA-1860.patch
>
>
> When the disk (RAID with caches dir) dies on a Kafka broker, the filesystem
> is typically remounted read-only, so when Kafka tries to write to the disk,
> it gets a FileNotFoundException with the read-only errno set (EROFS).
> However, as long as no produce request is received, and hence no writes are
> attempted on the disks, Kafka will not exit on this FATAL error (when the
> disk starts working again, Kafka might think some files are gone, while they
> will reappear later as the RAID comes back online). Instead it keeps spilling
> exceptions like:
> {code}
> 2015/01/07 09:47:41.543 ERROR [KafkaScheduler] [kafka-scheduler-1] [kafka-server] [] Uncaught exception in scheduled task 'kafka-recovery-point-checkpoint'
> java.io.FileNotFoundException: /export/content/kafka/i001_caches/recovery-point-offset-checkpoint.tmp (Read-only file system)
> at java.io.FileOutputStream.open(Native Method)
> at java.io.FileOutputStream.<init>(FileOutputStream.java:206)
> at java.io.FileOutputStream.<init>(FileOutputStream.java:156)
> at kafka.server.OffsetCheckpoint.write(OffsetCheckpoint.scala:37)
> {code}
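
The failure mode described in the issue could in principle be caught earlier by probing the log directory for writability instead of waiting for a produce request. The following is only a minimal sketch of that idea, not the approach taken in KAFKA-1860.patch: the class `LogDirProbe` and the method `isWritable` are hypothetical names and do not exist in Kafka. It assumes that creating a temporary file on a filesystem remounted read-only fails with an IOException.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only; not part of Kafka.
final class LogDirProbe {
    private LogDirProbe() {}

    /**
     * Returns true if a small temporary file can be created in dir and then
     * deleted. Any IOException (for example one caused by a read-only
     * filesystem or a missing directory) is treated as "not writable".
     */
    static boolean isWritable(Path dir) {
        try {
            Path probe = Files.createTempFile(dir, "probe-", ".tmp");
            Files.delete(probe);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

A caller might run such a probe from a periodic task and treat a false result as fatal, rather than logging the same exception on every checkpoint attempt. Whether to halt the broker at that point is a design choice this sketch does not address.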