[jira] [Updated] (KAFKA-1860) File system errors are not detected unless Kafka tries to write

2016-02-01 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-1860:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> File system errors are not detected unless Kafka tries to write
> ---
>
> Key: KAFKA-1860
> URL: https://issues.apache.org/jira/browse/KAFKA-1860
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Mayuresh Gharat
> Fix For: 0.9.1.0
>
> Attachments: KAFKA-1860.patch
>
>
> When the disk (RAID with caches dir) dies on a Kafka broker, the filesystem
> is typically remounted in read-only mode, so when Kafka tries to read the
> disk, it gets a FileNotFoundException with the read-only errno set (EROFS).
> However, as long as no produce request is received, and hence no writes are
> attempted on the disks, Kafka will not exit on such a FATAL error (when the
> disk starts working again, Kafka might think some files are gone, while they
> will reappear later as the RAID comes back online). Instead it keeps logging
> exceptions like:
> {code}
> 2015/01/07 09:47:41.543 ERROR [KafkaScheduler] [kafka-scheduler-1] 
> [kafka-server] [] Uncaught exception in scheduled task 
> 'kafka-recovery-point-checkpoint'
> java.io.FileNotFoundException: 
> /export/content/kafka/i001_caches/recovery-point-offset-checkpoint.tmp 
> (Read-only file system)
>   at java.io.FileOutputStream.open(Native Method)
>   at java.io.FileOutputStream.<init>(FileOutputStream.java:206)
>   at java.io.FileOutputStream.<init>(FileOutputStream.java:156)
>   at kafka.server.OffsetCheckpoint.write(OffsetCheckpoint.scala:37)
> {code}
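
The failure mode described above is that the scheduled checkpoint task only logs the exception and the broker keeps running. A minimal sketch of the alternative, escalating an I/O error from a scheduled task to a fatal handler, is shown below. This is not the attached KAFKA-1860 patch: the class `FatalIoGuard`, the `IoTask` interface, and the `runOrEscalate` method are hypothetical names used only for illustration, and the failing task merely simulates the read-only file system error from the log.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.function.Consumer;

// Hypothetical illustration only; none of these names exist in Kafka.
public class FatalIoGuard {

    /** A task that may fail with an I/O error, such as a checkpoint write. */
    @FunctionalInterface
    public interface IoTask {
        void run() throws IOException;
    }

    /**
     * Runs the task. If it throws an IOException (FileNotFoundException is a
     * subclass), the error is passed to onFatal instead of being only logged.
     * Returns true if the task completed without an I/O error.
     */
    public static boolean runOrEscalate(String name, IoTask task,
                                        Consumer<IOException> onFatal) {
        try {
            task.run();
            return true;
        } catch (IOException e) {
            System.err.println("Fatal I/O error in scheduled task '" + name
                    + "': " + e.getMessage());
            onFatal.accept(e);
            return false;
        }
    }

    public static void main(String[] args) {
        // Simulates the checkpoint write failing on a read-only file system.
        IoTask failingWrite = () -> {
            throw new FileNotFoundException(
                    "recovery-point-offset-checkpoint.tmp (Read-only file system)");
        };

        boolean[] escalated = {false};
        // Assumption: a real broker might halt the JVM here, for example via
        // Runtime.getRuntime().halt(1); the sketch only records the escalation.
        boolean completed = runOrEscalate("kafka-recovery-point-checkpoint",
                failingWrite, e -> escalated[0] = true);

        System.out.println("completed=" + completed + " escalated=" + escalated[0]);
    }
}
```

Passing the fatal handler in as a parameter keeps the sketch testable; whether a broker should halt, shut down cleanly, or take only the affected log directory offline is a design decision this sketch does not settle.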



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1860) File system errors are not detected unless Kafka tries to write

2015-03-17 Thread Mayuresh Gharat (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayuresh Gharat updated KAFKA-1860:
---
Status: Patch Available  (was: Open)

 File system errors are not detected unless Kafka tries to write
 ---

 Key: KAFKA-1860
 URL: https://issues.apache.org/jira/browse/KAFKA-1860
 Project: Kafka
  Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Mayuresh Gharat
 Fix For: 0.9.0

 Attachments: KAFKA-1860.patch


 When the disk (RAID with caches dir) dies on a Kafka broker, the filesystem
 is typically remounted in read-only mode, so when Kafka tries to read the
 disk, it gets a FileNotFoundException with the read-only errno set (EROFS).
 However, as long as no produce request is received, and hence no writes are
 attempted on the disks, Kafka will not exit on such a FATAL error (when the
 disk starts working again, Kafka might think some files are gone, while they
 will reappear later as the RAID comes back online). Instead it keeps logging
 exceptions like:
 {code}
 2015/01/07 09:47:41.543 ERROR [KafkaScheduler] [kafka-scheduler-1] 
 [kafka-server] [] Uncaught exception in scheduled task 
 'kafka-recovery-point-checkpoint'
 java.io.FileNotFoundException: 
 /export/content/kafka/i001_caches/recovery-point-offset-checkpoint.tmp 
 (Read-only file system)
   at java.io.FileOutputStream.open(Native Method)
   at java.io.FileOutputStream.<init>(FileOutputStream.java:206)
   at java.io.FileOutputStream.<init>(FileOutputStream.java:156)
   at kafka.server.OffsetCheckpoint.write(OffsetCheckpoint.scala:37)
 {code}


