[ 
https://issues.apache.org/jira/browse/KAFKA-2012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593812#comment-14593812
 ] 

Mayuresh Gharat edited comment on KAFKA-2012 at 6/19/15 7:30 PM:
-----------------------------------------------------------------

Discussed this with [~jjkoshy]. This patch seems like a workaround and does not 
actually tell us why the file got corrupted in the first place. We could 
add a config that turns this code path ON or OFF, so that we can actually 
investigate when this happens. 
Let me know whether I should open another ticket or use 
https://issues.apache.org/jira/browse/KAFKA-1554 to add that config.

This was discussed in KAFKA-1554:

Joel Koshy added a comment - 14/Mar/15 01:10
That would be a work-around, but ideally we should figure out why it happened 
in the first place.

 Jun Rao added a comment - 09/Apr/15 02:06
Yes, I am not sure if auto fixing the index is better. People then may not 
realize if there is an issue. It would be better to figure out what's causing 
this.


Thanks,

Mayuresh



> Broker should automatically handle corrupt index files
> ------------------------------------------------------
>
>                 Key: KAFKA-2012
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2012
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8.1.1
>            Reporter: Todd Palino
>            Assignee: Manikumar Reddy
>             Fix For: 0.8.3
>
>         Attachments: KAFKA-2012.patch, KAFKA-2012_2015-06-19_18:55:11.patch, 
> KAFKA-2012_2015-06-19_21:09:22.patch
>
>
> We had a bunch of unclean system shutdowns (power failure), which caused 
> corruption on our disks holding log segments in many cases. While the broker 
> is handling the log segment corruption properly (truncation), it is having 
> problems with corruption in the index files. Additionally, this only seems to 
> be happening on some startups (while we are upgrading).
> The broker should just do what I do when I hit a corrupt index file - remove 
> it and rebuild it.
> 2015/03/09 17:58:53.873 FATAL [KafkaServerStartable] [main] [kafka-server] [] Fatal error during KafkaServerStartable startup. Prepare to shutdown
> java.lang.IllegalArgumentException: requirement failed: Corrupt index found, index file (/export/content/kafka/i001_caches/__consumer_offsets-39/00000000000000000000.index) has non-zero size but the last offset is -2121629628 and the base offset is 0
>       at scala.Predef$.require(Predef.scala:233)
>       at kafka.log.OffsetIndex.sanityCheck(OffsetIndex.scala:352)
>       at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:185)
>       at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:184)
>       at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>       at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>       at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>       at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>       at kafka.log.Log.loadSegments(Log.scala:184)
>       at kafka.log.Log.<init>(Log.scala:82)
>       at kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$3$$anonfun$apply$7$$anonfun$apply$1.apply$mcV$sp(LogManager.scala:141)
>       at kafka.utils.Utils$$anon$1.run(Utils.scala:54)
>       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
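
For illustration only, the behaviour requested in the description (remove a corrupt 
index and rebuild it), gated by the ON/OFF switch proposed in the comment, might look 
roughly like the sketch below. This is not the attached patch and not Kafka's code. 
It assumes a simplified index of 8-byte entries (4-byte relative offset followed by a 
4-byte file position), and the names IndexRecoverySketch, isSane, loadIndex, 
Rebuilder and autoRebuild are all hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch; none of these names exist in Kafka.
final class IndexRecoverySketch {

    // Assumed entry layout: 4-byte relative offset + 4-byte file position.
    static final int ENTRY_SIZE = 8;

    /** Returns true when the index bytes look consistent for the given base offset. */
    static boolean isSane(byte[] index, long baseOffset) {
        if (index.length % ENTRY_SIZE != 0) {
            return false; // a partially written entry
        }
        if (index.length == 0) {
            return true; // an empty index is trivially valid
        }
        // Absolute read of the relative offset stored in the last entry.
        int lastRelative = ByteBuffer.wrap(index).getInt(index.length - ENTRY_SIZE);
        long lastOffset = baseOffset + lastRelative;
        // A non-empty index must end past the base offset.
        return lastOffset > baseOffset;
    }

    /** Hypothetical hook that regenerates index bytes by rescanning the log segment. */
    interface Rebuilder {
        byte[] rebuild() throws IOException;
    }

    /**
     * Loads an index file. When autoRebuild is false a corrupt index still fails
     * fast, so the root cause can be investigated; when true the file is removed
     * and rebuilt. Returns true if a rebuild took place.
     */
    static boolean loadIndex(Path indexFile, long baseOffset,
                             boolean autoRebuild, Rebuilder rebuilder) throws IOException {
        byte[] bytes = Files.readAllBytes(indexFile);
        if (isSane(bytes, baseOffset)) {
            return false;
        }
        if (!autoRebuild) {
            throw new IllegalStateException("Corrupt index found: " + indexFile);
        }
        Files.delete(indexFile);
        Files.write(indexFile, rebuilder.rebuild());
        return true;
    }
}
```

With the values from the log above (base offset 0 and a stored last offset of 
-2121629628), isSane returns false. Leaving autoRebuild off keeps the current 
fail-fast startup, which addresses the concern that silent auto-fixing could hide 
the underlying problem.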


