[jira] [Commented] (KAFKA-4972) Kafka 0.10.0 Found a corrupted index file during Kafka broker startup
[ https://issues.apache.org/jira/browse/KAFKA-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17059305#comment-17059305 ]

zhangchenghui commented on KAFKA-4972:
--------------------------------------

I wrote up an analysis of this specific problem: https://objcoding.com/2020/03/14/kafka-invalid-offset-exception/

> Kafka 0.10.0 Found a corrupted index file during Kafka broker startup
> ----------------------------------------------------------------------
>
>                 Key: KAFKA-4972
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4972
>             Project: Kafka
>          Issue Type: Bug
>          Components: log
>    Affects Versions: 0.10.0.0
>         Environment: JDK: HotSpot x64 1.7.0_80
> Tag: 0.10.0
>            Reporter: fangjinuo
>            Priority: Critical
>              Labels: reliability
>         Attachments: Snap3.png
>
>
> -deleted text-After force-shutting down all Kafka brokers one by one and
> restarting them one by one, one broker failed to start.
> The following WARN-level message was found in the log file:
> found a corrupted index file, .index, delete it ...
> See the attachment for details.
> ~I looked through the code in the core module and found that
> the non-thread-safe method LogSegment.append(offset, messages) has two callers:
> 1) Log.append(messages) // holds a synchronized lock
> 2) LogCleaner.cleanInto(topicAndPartition, source, dest, map, retainDeletes,
> messageFormatVersion) // does not
> So I guess this may be the cause of the repeated offset in the 0xx.log file
> (the log segment's .log file) ~
> This is only my inference, but I hope the problem can be fixed quickly.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (KAFKA-4972) Kafka 0.10.0 Found a corrupted index file during Kafka broker startup
[ https://issues.apache.org/jira/browse/KAFKA-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16961220#comment-16961220 ]

Stoyan Stoyanov commented on KAFKA-4972:
----------------------------------------

Seen in version 2.2.1
[jira] [Commented] (KAFKA-4972) Kafka 0.10.0 Found a corrupted index file during Kafka broker startup
[ https://issues.apache.org/jira/browse/KAFKA-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16929298#comment-16929298 ]

Sri Vishnu commented on KAFKA-4972:
-----------------------------------

Hi all, we had a similar issue when restarting our brokers. It turned out that, for us, it was an issue with the {{systemd}} configuration.

We have 350 GB of data on each broker with 150 topics, and shutting down the Kafka server takes about 8 minutes. However, {{systemd}} was configured to wait only 90 seconds for the server to shut down, after which it force-kills the server. When the server is restarted, it ends up with corrupted index files because it did not shut down properly.

The fix was to set {{TimeoutStopSec=600}} in the systemd configuration.

We summarised the issue and the fix in a blog post: [https://blog.experteer.engineering/kafka-corrupted-index-file-warnings-after-broker-restart.html]

Hopefully it is helpful for some of you.
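The systemd fix described in Sri Vishnu's comment can be applied as a drop-in file rather than by editing the unit file itself. A minimal sketch, assuming the broker runs under a unit named kafka.service; the unit name and the file path are assumptions, and only the {{TimeoutStopSec=600}} value comes from the comment:

```ini
# /etc/systemd/system/kafka.service.d/override.conf
# Assumed unit name: kafka.service. Adjust to match your installation.
[Service]
# Allow up to 10 minutes for a clean shutdown before systemd kills the broker.
# The comment reports a shutdown time of about 8 minutes for 350 GB of data.
TimeoutStopSec=600
```

After adding the file, run `systemctl daemon-reload` so systemd picks up the change. `systemctl show kafka.service -p TimeoutStopUSec` should then report the new timeout. The value should be larger than the longest shutdown you observe, so 600 may not suit every broker.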
[jira] [Commented] (KAFKA-4972) Kafka 0.10.0 Found a corrupted index file during Kafka broker startup
[ https://issues.apache.org/jira/browse/KAFKA-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16679867#comment-16679867 ]

Nenad Maric commented on KAFKA-4972:
------------------------------------

Any news about this bug? We have a similar problem on Kafka 1.1.0.

Here is the log output:
{code:java}
[2018-11-08 10:45:04,471] WARN [Log partition=, dir=/data] Found a corrupted index file corresponding to log file /data//06723263.log due to Corrupt index found, index file (/data//06723263.index) has non-zero size but the last offset is 6723263 which is no greater than the base offset 6723263.}, recovering segment and rebuilding index files... (kafka.log.Log)
{code}
{code:java}
[2018-11-08 10:46:28,351] ERROR There was an error in one of the threads during logs loading: java.lang.IllegalArgumentException: inconsistent range (kafka.log.LogManager)
[2018-11-08 10:46:28,356] ERROR [KafkaServer id=4] Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
java.lang.IllegalArgumentException: inconsistent range
	at java.util.concurrent.ConcurrentSkipListMap$SubMap.(ConcurrentSkipListMap.java:2620)
	at java.util.concurrent.ConcurrentSkipListMap.subMap(ConcurrentSkipListMap.java:2078)
	at java.util.concurrent.ConcurrentSkipListMap.subMap(ConcurrentSkipListMap.java:2114)
	at kafka.log.Log$$anonfun$12.apply(Log.scala:1561)
	at kafka.log.Log$$anonfun$12.apply(Log.scala:1560)
	at scala.Option.map(Option.scala:146)
	at kafka.log.Log.logSegments(Log.scala:1560)
	at kafka.log.Log.kafka$log$Log$$recoverSegment(Log.scala:358)
	at kafka.log.Log$$anonfun$completeSwapOperations$1.apply(Log.scala:389)
	at kafka.log.Log$$anonfun$completeSwapOperations$1.apply(Log.scala:380)
	at scala.collection.immutable.Set$Set1.foreach(Set.scala:94)
	at kafka.log.Log.completeSwapOperations(Log.scala:380)
	at kafka.log.Log.loadSegments(Log.scala:408)
	at kafka.log.Log.(Log.scala:216)
	at kafka.log.Log$.apply(Log.scala:1747)
	at kafka.log.LogManager.kafka$log$LogManager$$loadLog(LogManager.scala:255)
	at kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$11$$anonfun$apply$15$$anonfun$apply$2.apply$mcV$sp(LogManager.scala:335)
	at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:62)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[2018-11-08 10:46:28,402] INFO [KafkaServer id=4] shutting down (kafka.server.KafkaServer)
{code}
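For readers unfamiliar with the "inconsistent range" message in Nenad Maric's stack trace: it is raised by the JDK's ConcurrentSkipListMap.subMap when the lower bound is greater than the upper bound. The sketch below reproduces only that JDK behaviour. The class name, keys, and values are hypothetical, and it does not show which offsets Kafka actually passed during segment recovery, since the log does not say.

```java
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical demo class; not part of Kafka.
class InconsistentRangeDemo {

    // Returns "ok" if subMap accepts the range, otherwise the exception message.
    static String trySubMap(long from, long to) {
        // Stand-in for a map of segments keyed by base offset (illustrative values).
        ConcurrentSkipListMap<Long, String> segments = new ConcurrentSkipListMap<>();
        segments.put(0L, "segment-0");
        segments.put(100L, "segment-100");
        try {
            segments.subMap(from, true, to, false);
            return "ok";
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(trySubMap(0L, 100L)); // lower bound below upper bound
        System.out.println(trySubMap(100L, 0L)); // lower bound above upper bound
    }
}
```

The second call fails because the requested range starts after it ends, which suggests (but does not prove) that the recovery path in the trace computed a start offset beyond its end offset.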
[jira] [Commented] (KAFKA-4972) Kafka 0.10.0 Found a corrupted index file during Kafka broker startup
[ https://issues.apache.org/jira/browse/KAFKA-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16335027#comment-16335027 ]

Ewen Cheslack-Postava commented on KAFKA-4972:
----------------------------------------------

[~ijuma] anything to do here or are we at a loss until we have more info? I'd like to bump out of this release (and maybe just remove fixVersion entirely if we don't even know what the issue is).

> Fix For: 1.0.1
[jira] [Commented] (KAFKA-4972) Kafka 0.10.0 Found a corrupted index file during Kafka broker startup
[ https://issues.apache.org/jira/browse/KAFKA-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16191714#comment-16191714 ]

ASF GitHub Bot commented on KAFKA-4972:
---------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/kafka/pull/4016

> Fix For: 0.11.0.2, 1.0.1
[jira] [Commented] (KAFKA-4972) Kafka 0.10.0 Found a corrupted index file during Kafka broker startup
[ https://issues.apache.org/jira/browse/KAFKA-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16191337#comment-16191337 ]

Ismael Juma commented on KAFKA-4972:
------------------------------------

There is no thread safety issue after all. I clarified it in the code. So, this needs more investigation.

> Fix For: 0.11.0.2, 1.0.1
[jira] [Commented] (KAFKA-4972) Kafka 0.10.0 Found a corrupted index file during Kafka broker startup
[ https://issues.apache.org/jira/browse/KAFKA-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16191321#comment-16191321 ]

ASF GitHub Bot commented on KAFKA-4972:
---------------------------------------

GitHub user ijuma opened a pull request:

    https://github.com/apache/kafka/pull/4016

    MINOR: Simplify log cleaner and fix compiler warnings

    - Simplify LogCleaner.cleanSegments and add comment regarding thread
    unsafe usage of `LogSegment.append`. This was a result of investigating
    KAFKA-4972.
    - Fix compiler warnings.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ijuma/kafka simplify-log-cleaner-and-fix-warnings

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/4016.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4016

commit 3b26b21c4a41b9857d48a09a63a560228924df4f
Author: Ismael Juma
Date:   2017-10-04T13:57:03Z

    Simplify LogCleaner.cleanSegments and add comment regarding thread unsafe usage of `LogSegment.append`

commit a1e50d8fbffc977646397f0446efeaa798816d87
Author: Ismael Juma
Date:   2017-10-04T13:57:20Z

    Fix compiler warnings

> Fix For: 1.0.0, 0.11.0.2
[jira] [Commented] (KAFKA-4972) Kafka 0.10.0 Found a corrupted index file during Kafka broker startup
[ https://issues.apache.org/jira/browse/KAFKA-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189588#comment-16189588 ]

Ismael Juma commented on KAFKA-4972:
------------------------------------

I am not sure if it's the root cause, but it's a good observation that locking seems to be missing. We should fix this for 1.0.0.

> Fix For: 1.0.0, 0.11.0.2
[jira] [Commented] (KAFKA-4972) Kafka 0.10.0 Found a corrupted index file during Kafka broker startup
[ https://issues.apache.org/jira/browse/KAFKA-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16169681#comment-16169681 ]

Julius Žaromskis commented on KAFKA-4972:
-----------------------------------------

There are a bunch of warning messages in my log file, and Kafka is slow to restart.

{{[2017-09-18 06:53:19,349] WARN Found a corrupted index file due to requirement failed: Corrupt index found, index file (/var/kafka/dispatch.task-ack-6/00021796.index) has non-zero size but the last offset is 21796 which is no larger than the base offset 21796.}. deleting /var/kafka/dispatch.task-ack-6/00021796.timeindex, /var/kafka/dispatch.task-ack-6/00021796.index, and /var/kafka/dispatch.task-ack-6/00021796.txnindex and rebuilding index... (kafka.log.Log)
[2017-09-18 06:56:10,533] WARN Found a corrupted index file due to requirement failed: Corrupt index found, index file (/var/kafka/dispatch.task-ack-10/00027244.index) has non-zero size but the last offset is 27244 which is no larger than the base offset 27244.}. deleting /var/kafka/dispatch.task-ack-10/00027244.timeindex, /var/kafka/dispatch.task-ack-10/00027244.index, and /var/kafka/dispatch.task-ack-10/00027244.txnindex and rebuilding index... (kafka.log.Log)
[2017-09-18 07:09:17,710] WARN Found a corrupted index file due to requirement failed: Corrupt index found, index file (/var/kafka/dispatch.status-3/49362755.index) has non-zero size but the last offset is 49362755 which is no larger than the base offset 49362755.}. deleting /var/kafka/dispatch.status-3/49362755.timeindex, /var/kafka/dispatch.status-3/49362755.index, and /var/kafka/dispatch.status-3/49362755.txnindex and rebuilding index... (kafka.log.Log)}}

> Fix For: 0.11.0.2
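The warnings in Julius Žaromskis's comment all state the same condition: the index file has a non-zero size, yet its last offset is no larger than its base offset. A minimal sketch of that condition follows, written only from the wording of the warning. The class, method, and parameter names are hypothetical and are not Kafka's actual API, and the file size used in the example is an assumed value.

```java
// Hypothetical sketch of the condition described in the warning text; not Kafka code.
class IndexSanitySketch {

    // An index that claims to hold data (non-zero size) should end at an offset
    // strictly larger than the segment's base offset. If it does not, the
    // warning treats the index as corrupt and the broker rebuilds it.
    static boolean looksCorrupt(long indexSizeBytes, long baseOffset, long lastOffset) {
        return indexSizeBytes > 0 && lastOffset <= baseOffset;
    }

    public static void main(String[] args) {
        // Offsets taken from the first warning: base 21796, last 21796.
        // The size (8 bytes) is an assumed placeholder for "non-zero".
        System.out.println(looksCorrupt(8L, 21796L, 21796L));
        // An empty index is not flagged by this condition.
        System.out.println(looksCorrupt(0L, 21796L, 21796L));
    }
}
```

Under this reading the warning is a consistency check on the index file, and the broker's response is to rebuild the index from the log segment, which would account for the slow restart reported in the comment.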