[jira] [Commented] (KAFKA-6762) log-cleaner thread terminates due to java.lang.IllegalStateException
[ https://issues.apache.org/jira/browse/KAFKA-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014807#comment-17014807 ] zhangzhisheng commented on KAFKA-6762:
--------------------------------------

This happens in our production env, version 2.12_0.11.0.3. This is the error info:
{code:java}
[2020-01-13 23:53:50,497] ERROR [kafka-log-cleaner-thread-0]: Error due to (kafka.log.LogCleaner)
java.lang.IllegalStateException: This log contains a message larger than maximum allowable size of 100.
        at kafka.log.Cleaner.growBuffers(LogCleaner.scala:606)
        at kafka.log.Cleaner.cleanInto(LogCleaner.scala:557)
        at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:444)
        at kafka.log.Cleaner.$anonfun$doClean$6(LogCleaner.scala:385)
        at kafka.log.Cleaner.$anonfun$doClean$6$adapted(LogCleaner.scala:384)
        at scala.collection.immutable.List.foreach(List.scala:389)
        at kafka.log.Cleaner.doClean(LogCleaner.scala:384)
        at kafka.log.Cleaner.clean(LogCleaner.scala:361)
        at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:256)
        at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:236)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:64)
[2020-01-13 23:53:50,498] INFO [kafka-log-cleaner-thread-0]: Stopped (kafka.log.LogCleaner)
{code}

> log-cleaner thread terminates due to java.lang.IllegalStateException
>
> Key: KAFKA-6762
> URL: https://issues.apache.org/jira/browse/KAFKA-6762
> Project: Kafka
> Issue Type: Bug
> Components: core
> Affects Versions: 1.0.0
> Environment: os: GNU/Linux
>              arch: x86_64
>              Kernel: 4.9.77
>              jvm: OpenJDK 1.8.0
> Reporter: Ricardo Bartolome
> Priority: Major
> Attachments: __consumer_offsets-9_.tar.xz
>
> We are experiencing some problems with the kafka log-cleaner thread on Kafka
> 1.0.0. We have planned to update this cluster to 1.1.0 by next week in order
> to fix KAFKA-6683, but until then we can only confirm that it happens in 1.0.0.
> The log-cleaner thread crashes after a while with the following error:
> {code:java}
> [2018-03-28 11:14:40,199] INFO Cleaner 0: Beginning cleaning of log __consumer_offsets-31. (kafka.log.LogCleaner)
> [2018-03-28 11:14:40,199] INFO Cleaner 0: Building offset map for __consumer_offsets-31... (kafka.log.LogCleaner)
> [2018-03-28 11:14:40,218] INFO Cleaner 0: Building offset map for log __consumer_offsets-31 for 16 segments in offset range [1612869, 14282934). (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,566] INFO Cleaner 0: Offset map for log __consumer_offsets-31 complete. (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,566] INFO Cleaner 0: Cleaning log __consumer_offsets-31 (cleaning prior to Tue Mar 27 09:25:09 GMT 2018, discarding tombstones prior to Sat Feb 24 11:04:21 GMT 2018)... (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,567] INFO Cleaner 0: Cleaning segment 0 in log __consumer_offsets-31 (largest timestamp Fri Feb 23 11:40:54 GMT 2018) into 0, discarding deletes. (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,570] INFO Cleaner 0: Growing cleaner I/O buffers from 262144bytes to 524288 bytes. (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,576] INFO Cleaner 0: Growing cleaner I/O buffers from 524288bytes to 1000012 bytes. (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,593] ERROR [kafka-log-cleaner-thread-0]: Error due to (kafka.log.LogCleaner)
> java.lang.IllegalStateException: This log contains a message larger than maximum allowable size of 1000012.
>         at kafka.log.Cleaner.growBuffers(LogCleaner.scala:622)
>         at kafka.log.Cleaner.cleanInto(LogCleaner.scala:574)
>         at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:459)
>         at kafka.log.Cleaner.$anonfun$doClean$6(LogCleaner.scala:396)
>         at kafka.log.Cleaner.$anonfun$doClean$6$adapted(LogCleaner.scala:395)
>         at scala.collection.immutable.List.foreach(List.scala:389)
>         at kafka.log.Cleaner.doClean(LogCleaner.scala:395)
>         at kafka.log.Cleaner.clean(LogCleaner.scala:372)
>         at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:263)
>         at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:243)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:64)
> [2018-03-28 11:14:58,601] INFO [kafka-log-cleaner-thread-0]: Stopped (kafka.log.LogCleaner)
> [2018-04-04 14:25:12,773] INFO The cleaning for partition __broker-11-health-check-0 is aborted and paused (kafka.log.LogCleaner)
> [2018-04-04 14:25:12,773] INFO Compaction for partition __broker-11-health-check-0 is resumed (kafka.log.LogCleaner)
> [2018-04-04 14:25:12,774] INFO The cleaning for partition __broker-11-health-check-0 is aborted (kafka.log.LogCleaner)
> [2018-04-04 14:25:22,850] INFO Shutting down the log cleaner. (kafka.log.LogCleaner)
> [2018-04-04 14:25:22,850] INFO [kafka-log-cleaner-thread-0]: Shutting down (kafka.log.LogCleaner)
> [2018-04-04 14:25:22,850] INFO [kafka-log-cleaner-thread-0]: Shutdown completed (kafka.log.LogCleaner)
> {code}
> What we know so far is:
> * We are unable to reproduce it yet in a consistent manner.
> * It only happens in the PRO cluster and not in the PRE cluster for the same customer (whose message payloads are very similar).
> * Checking our Kafka logs, it only happened on the internal topics *__consumer_offsets-**.
> * When we restart the broker process the log-cleaner starts working again, but it can take between 3 minutes and some hours to die again.
> * We work around it by temporarily increasing the message.max.bytes and replica.fetch.max.bytes values to 10485760 (10MB) from the default 1000012 (~1MB).
> ** Before message.max.bytes = 10MB, we tried to match message.max.bytes with the value of replica.fetch.max.bytes (1048576), but the log-cleaner died with the same error but a different limit.
> ** This allowed the log-cleaner not to die
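The workaround reported in this issue (temporarily raising message.max.bytes and replica.fetch.max.bytes to 10485760) is a broker-side configuration change. As a hedged sketch only — the exact values, and whether replica.fetch.max.bytes really needs to match, depend on the deployment — the server.properties entries would look like:
{code}
# Workaround from this report: raise the size caps so the cleaner's I/O
# buffers can grow past ~1MB. Requires a broker restart; revert once the
# underlying bug is fixed.
message.max.bytes=10485760
replica.fetch.max.bytes=10485760
{code}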
[jira] [Commented] (KAFKA-6762) log-cleaner thread terminates due to java.lang.IllegalStateException
[ https://issues.apache.org/jira/browse/KAFKA-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17003098#comment-17003098 ] Fangbin Sun commented on KAFKA-6762:
------------------------------------

[~ijuma] This happens in our production env, version 0.10.2.0, and the above workaround comment by [~ricbartm] didn't work. How can we fix this, any advice?
[jira] [Commented] (KAFKA-6762) log-cleaner thread terminates due to java.lang.IllegalStateException
[ https://issues.apache.org/jira/browse/KAFKA-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488998#comment-16488998 ] Ismael Juma commented on KAFKA-6762:
------------------------------------

This has been fixed in CP 4.1.1, please check the release notes. It only happens if transactions are being used and there's already a JIRA for it.
[jira] [Commented] (KAFKA-6762) log-cleaner thread terminates due to java.lang.IllegalStateException
[ https://issues.apache.org/jira/browse/KAFKA-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488588#comment-16488588 ] Uwe Eisele commented on KAFKA-6762:
-----------------------------------

After a restart of the broker, the log-cleaner thread runs into the same problem again. While debugging the log cleaner we saw that the affected log segment contains only deletable control batches in at least the first 1000012 bytes. In this case the log cleaner states that zero messages have been read and finally throws the IllegalStateException. It seems that the log cleaner cannot handle larger chunks of a segment that contain only deletable records.
{code:java}
092 class LogCleaner ... {
...
558   private[log] def cleanInto(...) {
...
595     while (position < sourceRecords.sizeInBytes) {
...
601       sourceRecords.readInto(readBuffer, position)
602       val records = MemoryRecords.readableRecords(readBuffer)
...
604       val result = records.filterTo(topicPartition, logCleanerFilter, writeBuffer, maxLogMessageSize, decompressionBufferSupplier)
          // in this case an empty result with zero read messages is returned, though the read
          // buffer contains records (as mentioned above, all of them are deletable control batches)
...
626       if (readBuffer.limit() > 0 && result.messagesRead == 0)
627         growBuffers(maxLogMessageSize) // throws the IllegalStateException if the readBuffer has reached the max message size
628     }
...
{code}
The reason that no records are returned by the filter method is the following code block in the MemoryRecords class:
{code:java}
141 private static FilterResult filterTo(...) {
...
154     for (MutableRecordBatch batch : batches) {
155         bytesRead += batch.sizeInBytes();
156
157         BatchRetention batchRetention = filter.checkBatchRetention(batch);
158         if (batchRetention == BatchRetention.DELETE)
159             continue;
{code}
The segment which caused this problem can be found here: [^__consumer_offsets-9_.tar.xz]
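The control-batch analysis above can be condensed into a small, self-contained sketch (hypothetical Java, not Kafka's actual classes; the names grow and growStepsUntilFailure are illustrative): when every batch in the read window is deletable, filterTo retains zero messages, so the cleaner keeps doubling its buffers until it reaches the cap and then throws.
{code:java}
// Hypothetical sketch of the cleaner's buffer-growth loop described above.
public class CleanerBufferSketch {

    // Double `capacity` toward `maxBufferSize`; fail once it cannot grow further.
    static int grow(int capacity, int maxBufferSize) {
        if (capacity >= maxBufferSize)
            throw new IllegalStateException(
                "This log contains a message larger than maximum allowable size of "
                    + maxBufferSize + ".");
        return Math.min(capacity * 2, maxBufferSize);
    }

    // Simulate a read window where every batch is a deletable control batch,
    // so each filter pass reads zero messages and forces another grow.
    static int growStepsUntilFailure(int capacity, int maxBufferSize) {
        int steps = 0;
        while (true) {
            int messagesRead = 0; // filterTo() kept nothing: BatchRetention.DELETE for all
            if (messagesRead == 0) {
                try {
                    capacity = grow(capacity, maxBufferSize);
                    steps++;
                } catch (IllegalStateException e) {
                    return steps; // the real cleaner thread dies here
                }
            }
        }
    }

    public static void main(String[] args) {
        // 262144 -> 524288 -> 1000012, then the exception: two grow steps.
        System.out.println(growStepsUntilFailure(262144, 1000012));
    }
}
{code}
This matches the log pattern in the reports above: two INFO "Growing cleaner I/O buffers" lines, then the ERROR, and the thread stops.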
[jira] [Commented] (KAFKA-6762) log-cleaner thread terminates due to java.lang.IllegalStateException
[ https://issues.apache.org/jira/browse/KAFKA-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486975#comment-16486975 ] Uwe Eisele commented on KAFKA-6762:
-----------------------------------

We are experiencing the same problem. The log cleaner threads of the brokers responsible for the __consumer_offsets-9 partition died because of an IllegalStateException.
{code:java}
[2018-05-22T10:00:27.431+02:00] INFO Cleaner 0: Beginning cleaning of log __consumer_offsets-9. (kafka.log.LogCleaner)
[2018-05-22T10:00:27.431+02:00] INFO Cleaner 0: Building offset map for __consumer_offsets-9... (kafka.log.LogCleaner)
[2018-05-22T10:00:27.451+02:00] INFO Cleaner 0: Building offset map for log __consumer_offsets-9 for 8 segments in offset range [95646530, 96708067). (kafka.log.LogCleaner)
[2018-05-22T10:00:29.604+02:00] INFO Cleaner 0: Offset map for log __consumer_offsets-9 complete. (kafka.log.LogCleaner)
[2018-05-22T10:00:29.604+02:00] INFO Cleaner 0: Cleaning log __consumer_offsets-9 (cleaning prior to Wed May 16 16:38:40 CEST 2018, discarding tombstones prior to Sun May 06 18:36:02 CEST 2018)... (kafka.log.LogCleaner)
[2018-05-22T10:00:29.604+02:00] INFO Cleaner 0: Cleaning segment 0 in log __consumer_offsets-9 (largest timestamp Fri May 04 18:01:11 CEST 2018) into 0, discarding deletes. (kafka.log.LogCleaner)
[2018-05-22T10:00:29.614+02:00] INFO Cleaner 0: Growing cleaner I/O buffers from 262144bytes to 524288 bytes. (kafka.log.LogCleaner)
[2018-05-22T10:00:29.620+02:00] INFO Cleaner 0: Growing cleaner I/O buffers from 524288bytes to 1000012 bytes. (kafka.log.LogCleaner)
[2018-05-22T10:00:29.639+02:00] ERROR [kafka-log-cleaner-thread-0]: Error due to (kafka.log.LogCleaner)
java.lang.IllegalStateException: This log contains a message larger than maximum allowable size of 1000012.
        at kafka.log.Cleaner.growBuffers(LogCleaner.scala:675)
        at kafka.log.Cleaner.cleanInto(LogCleaner.scala:627)
        at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:518)
        at kafka.log.Cleaner$$anonfun$doClean$4.apply(LogCleaner.scala:462)
        at kafka.log.Cleaner$$anonfun$doClean$4.apply(LogCleaner.scala:461)
        at scala.collection.immutable.List.foreach(List.scala:392)
        at kafka.log.Cleaner.doClean(LogCleaner.scala:461)
        at kafka.log.Cleaner.clean(LogCleaner.scala:438)
        at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:305)
        at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:291)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
[2018-05-22T10:00:29.640+02:00] INFO [kafka-log-cleaner-thread-0]: Stopped (kafka.log.LogCleaner)
{code}
We are running Kafka 1.1.0.
[jira] [Commented] (KAFKA-6762) log-cleaner thread terminates due to java.lang.IllegalStateException
[ https://issues.apache.org/jira/browse/KAFKA-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16465471#comment-16465471 ] Manikumar commented on KAFKA-6762:
----------------------------------

looks like this is related to KAFKA-6834
[jira] [Commented] (KAFKA-6762) log-cleaner thread terminates due to java.lang.IllegalStateException
[ https://issues.apache.org/jira/browse/KAFKA-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16435653#comment-16435653 ] Ricardo Bartolome commented on KAFKA-6762:
------------------------------------------

Hello [~yuzhih...@gmail.com]. We enabled DEBUG on the log-cleaner in a PRE cluster to see the impact, and today we enabled it in PRO too. Nevertheless, we upgraded to Kafka 1.1.0 this week by performing a rolling upgrade (terminating EC2 instances and spinning up new ones that replicate data from the others), and so far no broker has been affected by this. We are pretty lost on how to reproduce this, and so far we have been unable to. If it happens again, I'd expect to see something in log-cleaner.log. Our change has been as follows:
{code}
diff -u /etc/kafka/log4j.properties /tmp/log4j.properties
--- /etc/kafka/log4j.properties 2018-02-13 14:10:09.0 +
+++ /tmp/log4j.properties       2018-04-07 15:39:19.371814641 +
@@ -88,7 +88,7 @@
 log4j.logger.kafka.controller=INFO, controllerAppender
 log4j.additivity.kafka.controller=false
 
-log4j.logger.kafka.log.LogCleaner=INFO, cleanerAppender
+log4j.logger.kafka.log.LogCleaner=DEBUG, cleanerAppender
 log4j.additivity.kafka.log.LogCleaner=false
 
 log4j.logger.state.change.logger=INFO, stateChangeAppender
{code}
[jira] [Commented] (KAFKA-6762) log-cleaner thread terminates due to java.lang.IllegalStateException
[ https://issues.apache.org/jira/browse/KAFKA-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16429416#comment-16429416 ]

Ted Yu commented on KAFKA-6762:
---

Here is the related code:
{code}
if(readBuffer.capacity >= maxBufferSize || writeBuffer.capacity >= maxBufferSize)
  throw new IllegalStateException("This log contains a message larger than maximum allowable size of %s.".format(maxBufferSize))
{code}
At a minimum we should include the read/write buffer capacities in the exception message. This is the next line:
{code}
val newSize = math.min(this.readBuffer.capacity * 2, maxBufferSize)
{code}
I think the condition for throwing IllegalStateException should be relaxed: if we cannot grow the buffer by doubling, we can still grow it by an increment large enough to let log cleaning proceed.

Can you turn on DEBUG logging in your cluster? Maybe we can get more information from the DEBUG log.

Thanks

> log-cleaner thread terminates due to java.lang.IllegalStateException
>
> What we know so far is:
> * We are unable to reproduce it yet in a consistent manner.
> * It only happens in the PRO cluster and not in the
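The relaxed growth policy suggested in the comment above could look roughly like the following sketch. This is illustrative only, not the actual Kafka code: `newBufferSize`, `currentCapacity`, and `requiredSize` are made-up names standing in for the fields used by `kafka.log.Cleaner.growBuffers`.

```scala
// Sketch (assumption, not the Kafka implementation): grow by doubling when
// possible, but always grow at least to the size actually required, capped
// at maxBufferSize. Throw only when the message genuinely cannot fit.
object BufferGrowth {
  def newBufferSize(currentCapacity: Int, requiredSize: Int, maxBufferSize: Int): Int = {
    if (requiredSize > maxBufferSize)
      // Include the current capacity in the message, as suggested above.
      throw new IllegalStateException(
        s"This log contains a message of size $requiredSize, larger than the maximum " +
        s"allowable size of $maxBufferSize (current buffer capacity: $currentCapacity).")
    // Double if that is enough, otherwise grow exactly to the required size;
    // never exceed maxBufferSize.
    math.min(math.max(currentCapacity * 2, requiredSize), maxBufferSize)
  }
}
```

With this policy, a cleaner whose buffers have already reached an awkward size no longer dies as long as the oversized batch still fits within `maxBufferSize`.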