[jira] [Commented] (KAFKA-4972) Kafka 0.10.0 Found a corrupted index file during Kafka broker startup

2020-03-14 Thread zhangchenghui (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17059305#comment-17059305
 ] 

zhangchenghui commented on KAFKA-4972:
--

I wrote up a detailed analysis of this problem:
https://objcoding.com/2020/03/14/kafka-invalid-offset-exception/

> Kafka 0.10.0  Found a corrupted index file during Kafka broker startup
> --
>
> Key: KAFKA-4972
> URL: https://issues.apache.org/jira/browse/KAFKA-4972
> Project: Kafka
> Issue Type: Bug
> Components: log
> Affects Versions: 0.10.0.0
> Environment: JDK: HotSpot  x64  1.7.0_80
> Tag: 0.10.0
> Reporter: fangjinuo
> Priority: Critical
> Labels: reliability
> Attachments: Snap3.png
>
>
> After force-shutting down all Kafka brokers one by one and then restarting
> them one by one, one broker failed to start.
> The following WARN-level message was found in the log file:
> found a corrupted index file,  .index , delete it  ...
> Details are in the attached screenshot.
> I looked through the code in the core module and found that
> the non-thread-safe method LogSegment.append(offset, messages) has two callers:
> 1) Log.append(messages)  // holds a synchronized lock
> 2) LogCleaner.cleanInto(topicAndPartition, source, dest, map, retainDeletes,
> messageFormatVersion)   // does not hold a lock
> So I guess this may be the reason for the repeated offsets in the 0xx.log file
> (the log segment's .log file).
> This is only my inference, but I hope the problem can be fixed quickly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-4972) Kafka 0.10.0 Found a corrupted index file during Kafka broker startup

2019-10-28 Thread Stoyan Stoyanov (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961220#comment-16961220
 ] 

Stoyan Stoyanov commented on KAFKA-4972:


Seen in version 2.2.1



[jira] [Commented] (KAFKA-4972) Kafka 0.10.0 Found a corrupted index file during Kafka broker startup

2019-09-13 Thread Sri Vishnu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929298#comment-16929298
 ] 

Sri Vishnu commented on KAFKA-4972:
---

Hi all,

We had a similar issue when restarting our brokers. It turned out that, for
us, the cause was the {{systemd}} configuration.

We have 350 GB of data on each broker across 150 topics, and shutting down the
Kafka server takes about 8 minutes. However, {{systemd}} was configured to wait
only 90 seconds for the server to shut down before force-killing it. When the
server is restarted after that, it ends up with corrupted index files because
it didn't shut down cleanly. The fix was to set {{TimeoutStopSec=600}} in the
systemd configuration.

We summarised the issue and the fix in a blog post: 
[https://blog.experteer.engineering/kafka-corrupted-index-file-warnings-after-broker-restart.html]

Hopefully, it is helpful for some of you.
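
For readers who want to try the same fix, here is a minimal sketch of a systemd drop-in that raises the stop timeout. The unit name (kafka.service) and the drop-in path are assumptions for illustration and are not taken from the comment above; only {{TimeoutStopSec=600}} comes from it.

```ini
# Hypothetical path: /etc/systemd/system/kafka.service.d/override.conf
# (assumes the broker runs under a unit named kafka.service)
[Service]
# Allow up to 600 seconds for a clean shutdown before systemd force-kills the process.
TimeoutStopSec=600
```

After adding a drop-in like this, `systemctl daemon-reload` is needed for systemd to pick up the change. The value should comfortably exceed the observed shutdown time (about 8 minutes in the case described above).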



[jira] [Commented] (KAFKA-4972) Kafka 0.10.0 Found a corrupted index file during Kafka broker startup

2018-11-08 Thread Nenad Maric (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679867#comment-16679867
 ] 

Nenad Maric commented on KAFKA-4972:


Any news about this bug?
We have a similar problem on Kafka 1.1.0. Here is the log output:
{code:java}
[2018-11-08 10:45:04,471] WARN [Log partition=, dir=/data] 
Found a corrupted index file corresponding to log file 
/data//06723263.log due to Corrupt index 
found, index file (/data//06723263.index) has 
non-zero size but the last offset is 6723263 which is no greater than the base 
offset 6723263.}, recovering segment and rebuilding index files... 
(kafka.log.Log){code}
{code:java}
[2018-11-08 10:46:28,351] ERROR There was an error in one of the threads during logs loading: java.lang.IllegalArgumentException: inconsistent range (kafka.log.LogManager)
[2018-11-08 10:46:28,356] ERROR [KafkaServer id=4] Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
java.lang.IllegalArgumentException: inconsistent range
 at java.util.concurrent.ConcurrentSkipListMap$SubMap.<init>(ConcurrentSkipListMap.java:2620)
 at java.util.concurrent.ConcurrentSkipListMap.subMap(ConcurrentSkipListMap.java:2078)
 at java.util.concurrent.ConcurrentSkipListMap.subMap(ConcurrentSkipListMap.java:2114)
 at kafka.log.Log$$anonfun$12.apply(Log.scala:1561)
 at kafka.log.Log$$anonfun$12.apply(Log.scala:1560)
 at scala.Option.map(Option.scala:146)
 at kafka.log.Log.logSegments(Log.scala:1560)
 at kafka.log.Log.kafka$log$Log$$recoverSegment(Log.scala:358)
 at kafka.log.Log$$anonfun$completeSwapOperations$1.apply(Log.scala:389)
 at kafka.log.Log$$anonfun$completeSwapOperations$1.apply(Log.scala:380)
 at scala.collection.immutable.Set$Set1.foreach(Set.scala:94)
 at kafka.log.Log.completeSwapOperations(Log.scala:380)
 at kafka.log.Log.loadSegments(Log.scala:408)
 at kafka.log.Log.<init>(Log.scala:216)
 at kafka.log.Log$.apply(Log.scala:1747)
 at kafka.log.LogManager.kafka$log$LogManager$$loadLog(LogManager.scala:255)
 at kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$11$$anonfun$apply$15$$anonfun$apply$2.apply$mcV$sp(LogManager.scala:335)
 at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:62)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
[2018-11-08 10:46:28,402] INFO [KafkaServer id=4] shutting down (kafka.server.KafkaServer){code}
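
The "inconsistent range" exception at the top of the stack trace is thrown by the JDK, not by Kafka: `ConcurrentSkipListMap.subMap(from, to)` rejects a range whose lower bound sorts after its upper bound. The sketch below reproduces only that JDK behaviour. The class name, map contents, and offsets are made up for illustration, and it does not attempt to explain how Kafka arrived at such a range.

```java
import java.util.concurrent.ConcurrentSkipListMap;

public class InconsistentRangeDemo {
    // Returns a short description of what subMap(from, to) does for the given bounds.
    static String describe(long from, long to) {
        // Hypothetical stand-in for a map of base offset -> segment.
        ConcurrentSkipListMap<Long, String> segments = new ConcurrentSkipListMap<>();
        segments.put(0L, "segment-0");
        segments.put(100L, "segment-100");
        try {
            // from is inclusive, to is exclusive
            return "ok: " + segments.subMap(from, to).size();
        } catch (IllegalArgumentException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(0L, 200L));   // well-formed range
        System.out.println(describe(200L, 100L)); // lower bound greater than upper bound
    }
}
```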



[jira] [Commented] (KAFKA-4972) Kafka 0.10.0 Found a corrupted index file during Kafka broker startup

2018-01-22 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16335027#comment-16335027
 ] 

Ewen Cheslack-Postava commented on KAFKA-4972:
--

[~ijuma] Is there anything to do here, or are we at a loss until we have more
info? I'd like to bump this out of this release (and maybe just remove the
fixVersion entirely if we don't even know what the issue is).

> Kafka 0.10.0  Found a corrupted index file during Kafka broker startup
> --
>
> Key: KAFKA-4972
> URL: https://issues.apache.org/jira/browse/KAFKA-4972
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.10.0.0
> Environment: JDK: HotSpot  x64  1.7.0_80
> Tag: 0.10.0
>Reporter: fangjinuo
>Priority: Critical
>  Labels: reliability
> Fix For: 1.0.1
>
> Attachments: Snap3.png
>
>


[jira] [Commented] (KAFKA-4972) Kafka 0.10.0 Found a corrupted index file during Kafka broker startup

2017-10-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191714#comment-16191714
 ] 

ASF GitHub Bot commented on KAFKA-4972:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/4016


> Kafka 0.10.0  Found a corrupted index file during Kafka broker startup
> --
>
> Key: KAFKA-4972
> URL: https://issues.apache.org/jira/browse/KAFKA-4972
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.10.0.0
> Environment: JDK: HotSpot  x64  1.7.0_80
> Tag: 0.10.0
>Reporter: fangjinuo
>Priority: Critical
>  Labels: reliability
> Fix For: 0.11.0.2, 1.0.1
>
> Attachments: Snap3.png
>
>


[jira] [Commented] (KAFKA-4972) Kafka 0.10.0 Found a corrupted index file during Kafka broker startup

2017-10-04 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191337#comment-16191337
 ] 

Ismael Juma commented on KAFKA-4972:


There is no thread safety issue after all. I clarified it in the code. So, this 
needs more investigation.
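
The thread does not spell out why the unsynchronized call turned out to be safe. One general reason such a call can be safe is thread confinement: an object that only one thread can reach needs no lock until it is published to other threads. The sketch below illustrates that general pattern only. It is not taken from the Kafka code, and every class and method name in it is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class ConfinementSketch {
    /** Hypothetical stand-in for a segment; append is NOT thread-safe. */
    static final class Segment {
        private final List<Long> offsets = new ArrayList<>();

        void append(long offset) {
            if (!offsets.isEmpty() && offset <= offsets.get(offsets.size() - 1)) {
                throw new IllegalArgumentException("offsets must increase");
            }
            offsets.add(offset);
        }

        int size() {
            return offsets.size();
        }
    }

    /** Hypothetical stand-in for a log; the shared segment is guarded by a lock. */
    static final class Log {
        private final Object lock = new Object();
        private Segment active = new Segment();

        void append(long offset) {
            synchronized (lock) {
                active.append(offset);
            }
        }

        void replaceActive(Segment cleaned) {
            synchronized (lock) {
                active = cleaned;
            }
        }

        int size() {
            synchronized (lock) {
                return active.size();
            }
        }
    }

    /** Builds a new segment privately, then publishes it under the lock. */
    static int cleanAndSwap(Log log, long[] retained) {
        Segment dest = new Segment();   // reachable only from this thread so far
        for (long offset : retained) {
            dest.append(offset);        // no lock needed while dest is unpublished
        }
        log.replaceActive(dest);        // publication happens under the lock
        return log.size();
    }
}
```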

> Kafka 0.10.0  Found a corrupted index file during Kafka broker startup
> --
>
> Key: KAFKA-4972
> URL: https://issues.apache.org/jira/browse/KAFKA-4972
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.10.0.0
> Environment: JDK: HotSpot  x64  1.7.0_80
> Tag: 0.10.0
>Reporter: fangjinuo
>Priority: Critical
> Fix For: 0.11.0.2, 1.0.1
>
> Attachments: Snap3.png
>
>


[jira] [Commented] (KAFKA-4972) Kafka 0.10.0 Found a corrupted index file during Kafka broker startup

2017-10-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191321#comment-16191321
 ] 

ASF GitHub Bot commented on KAFKA-4972:
---

GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/4016

MINOR: Simplify log cleaner and fix compiler warnings

- Simplify LogCleaner.cleanSegments and add comment regarding thread
unsafe usage of `LogSegment.append`. This was a result of investigating
KAFKA-4972.
- Fix compiler warnings.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka simplify-log-cleaner-and-fix-warnings

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4016.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4016


commit 3b26b21c4a41b9857d48a09a63a560228924df4f
Author: Ismael Juma 
Date:   2017-10-04T13:57:03Z

Simplify LogCleaner.cleanSegments and add comment regarding thread unsafe 
usage of `LogSegment.append`

commit a1e50d8fbffc977646397f0446efeaa798816d87
Author: Ismael Juma 
Date:   2017-10-04T13:57:20Z

Fix compiler warnings




> Kafka 0.10.0  Found a corrupted index file during Kafka broker startup
> --
>
> Key: KAFKA-4972
> URL: https://issues.apache.org/jira/browse/KAFKA-4972
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.10.0.0
> Environment: JDK: HotSpot  x64  1.7.0_80
> Tag: 0.10.0
>Reporter: fangjinuo
>Priority: Critical
> Fix For: 1.0.0, 0.11.0.2
>
> Attachments: Snap3.png
>
>


[jira] [Commented] (KAFKA-4972) Kafka 0.10.0 Found a corrupted index file during Kafka broker startup

2017-10-03 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16189588#comment-16189588
 ] 

Ismael Juma commented on KAFKA-4972:


I am not sure if it's the root cause, but it's a good observation that locking 
seems to be missing. We should fix this for 1.0.0.

> Kafka 0.10.0  Found a corrupted index file during Kafka broker startup
> --
>
> Key: KAFKA-4972
> URL: https://issues.apache.org/jira/browse/KAFKA-4972
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.10.0.0
> Environment: JDK: HotSpot  x64  1.7.0_80
> Tag: 0.10.0
>Reporter: fangjinuo
>Priority: Critical
> Fix For: 1.0.0, 0.11.0.2
>
> Attachments: Snap3.png
>
>


[jira] [Commented] (KAFKA-4972) Kafka 0.10.0 Found a corrupted index file during Kafka broker startup

2017-09-18 Thread Julius Žaromskis (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169681#comment-16169681
 ] 

Julius Žaromskis commented on KAFKA-4972:
-

There are a bunch of warning messages in my log file, and Kafka is slow to restart:

{{[2017-09-18 06:53:19,349] WARN Found a corrupted index file due to 
requirement failed: Corrupt index found, index file 
(/var/kafka/dispatch.task-ack-6/00021796.index) has non-zero size 
but the last offset is 21796 which is no larger than the base offset 21796.}. 
deleting /var/kafka/dispatch.task-ack-6/00021796.timeindex, 
/var/kafka/dispatch.task-ack-6/00021796.index, and 
/var/kafka/dispatch.task-ack-6/00021796.txnindex and rebuilding 
index... (kafka.log.Log)
[2017-09-18 06:56:10,533] WARN Found a corrupted index file due to requirement 
failed: Corrupt index found, index file 
(/var/kafka/dispatch.task-ack-10/00027244.index) has non-zero size 
but the last offset is 27244 which is no larger than the base offset 27244.}. 
deleting /var/kafka/dispatch.task-ack-10/00027244.timeindex, 
/var/kafka/dispatch.task-ack-10/00027244.index, and 
/var/kafka/dispatch.task-ack-10/00027244.txnindex and rebuilding 
index... (kafka.log.Log)
[2017-09-18 07:09:17,710] WARN Found a corrupted index file due to requirement 
failed: Corrupt index found, index file 
(/var/kafka/dispatch.status-3/49362755.index) has non-zero size but 
the last offset is 49362755 which is no larger than the base offset 49362755.}. 
deleting /var/kafka/dispatch.status-3/49362755.timeindex, 
/var/kafka/dispatch.status-3/49362755.index, and 
/var/kafka/dispatch.status-3/49362755.txnindex and rebuilding 
index... (kafka.log.Log)}}
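
The condition these warnings describe can be sketched as a small predicate: the index file is non-empty, yet its last offset is not larger than the segment's base offset. This is a reading of the log message only, not Kafka's actual implementation. The class and method names are hypothetical and the file sizes are made-up values.

```java
public class IndexSanityCheckSketch {
    /**
     * Returns true when an index matches the condition in the WARN message:
     * non-zero size, but last offset no larger than the base offset.
     */
    static boolean looksCorrupt(long indexSizeBytes, long baseOffset, long lastOffset) {
        return indexSizeBytes > 0 && lastOffset <= baseOffset;
    }

    public static void main(String[] args) {
        // Offsets from the first warning above; the size (8 bytes) is invented.
        System.out.println(looksCorrupt(8L, 21796L, 21796L));
        // An empty index is not flagged, whatever the offsets are.
        System.out.println(looksCorrupt(0L, 21796L, 21796L));
        // A non-empty index whose last offset is past the base offset is not flagged.
        System.out.println(looksCorrupt(8L, 21796L, 21900L));
    }
}
```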

> Kafka 0.10.0  Found a corrupted index file during Kafka broker startup
> --
>
> Key: KAFKA-4972
> URL: https://issues.apache.org/jira/browse/KAFKA-4972
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.10.0.0
> Environment: JDK: HotSpot  x64  1.7.0_80
> Tag: 0.10.0
>Reporter: fangjinuo
>Priority: Critical
> Fix For: 0.11.0.2
>
> Attachments: Snap3.png
>
>