[ https://issues.apache.org/jira/browse/KAFKA-3915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385866#comment-15385866 ]
ASF GitHub Bot commented on KAFKA-3915:
---------------------------------------
GitHub user ijuma opened a pull request:
https://github.com/apache/kafka/pull/1643
KAFKA-3915; Don't convert messages from v0 to v1 during log compaction
The conversion is unsafe as the converted message size may be greater
than the message size limit. Updated `LogCleanerIntegrationTest` to test
the max message size case for both V0 and the current version.
Also include a few minor clean-ups:
* Remove unused expression
* Avoid unintentional usage of `scala.collection.immutable.Stream` (`toSeq`
on an `Iterator`)
* Add explicit result type in `FileMessageSet.iterator`
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ijuma/kafka kafka-3915-log-cleaner-io-buffers-message-conversion
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/kafka/pull/1643.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1643
----
commit 083029d12a12ca0e750dddf08a4ce8f4ec5db8bb
Author: Ismael Juma <[email protected]>
Date: 2016-07-20T13:34:04Z
Don't convert messages from version 0 to version 1 during log compaction
The conversion is unsafe as the converted message size may be greater
than the message size limit.
commit 1262c2f87f6dd65c8624dde7f3406de7ab00cb99
Author: Ismael Juma <[email protected]>
Date: 2016-07-20T13:35:47Z
Remove unused expression, avoid usage of scala.Stream and use explicit
return type for public method
----
> LogCleaner IO buffers do not account for potential size difference due to
> message format change
> -----------------------------------------------------------------------------------------------
>
> Key: KAFKA-3915
> URL: https://issues.apache.org/jira/browse/KAFKA-3915
> Project: Kafka
> Issue Type: Bug
> Components: log
> Affects Versions: 0.10.0.0
> Reporter: Tommy Becker
> Assignee: Ismael Juma
> Priority: Blocker
> Fix For: 0.10.0.1
>
>
> We are upgrading from Kafka 0.8.1 to 0.10.0.0 and discovered an issue after
> getting the following exception from the log cleaner:
> {code}
> [2016-06-28 10:02:18,759] ERROR [kafka-log-cleaner-thread-0], Error due to (kafka.log.LogCleaner)
> java.nio.BufferOverflowException
> at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:206)
> at kafka.message.ByteBufferMessageSet$.writeMessage(ByteBufferMessageSet.scala:169)
> at kafka.log.Cleaner$$anonfun$cleanInto$1.apply(LogCleaner.scala:435)
> at kafka.log.Cleaner$$anonfun$cleanInto$1.apply(LogCleaner.scala:429)
> at scala.collection.Iterator$class.foreach(Iterator.scala:893)
> at kafka.utils.IteratorTemplate.foreach(IteratorTemplate.scala:30)
> at kafka.log.Cleaner.cleanInto(LogCleaner.scala:429)
> at kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:380)
> at kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:376)
> at scala.collection.immutable.List.foreach(List.scala:381)
> at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:376)
> at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:343)
> at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:342)
> at scala.collection.immutable.List.foreach(List.scala:381)
> at kafka.log.Cleaner.clean(LogCleaner.scala:342)
> at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:237)
> at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:215)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> {code}
> At first this seems impossible because the input and output buffers are
> identically sized. But in the case where the source messages are of an older
> format, additional space may be required to write them out in the new one.
> Since the message header is 8 bytes larger in 0.10.0, this failure can
> happen.
> We're planning to work around this by adding the following config:
> {code}log.message.format.version=0.8.1{code}
> However, this definitely needs a proper fix.
> We could simply preserve the existing message format (since in this case we
> can't retroactively add a timestamp anyway). Otherwise, the log cleaner would
> have to be smarter about ensuring there is sufficient "slack space" in the
> output buffer: at least the per-message size difference multiplied by the
> number of messages in the input buffer.
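
The arithmetic behind the report can be illustrated with a small, self-contained sketch. It is not the LogCleaner code: the class `CleanerBufferSketch` and the helpers `requiredOutputSize` and `convertInto` are hypothetical names, and the only figure taken from the report is the 8-byte header growth between the old and new message formats. The sketch copies fixed-size dummy messages into an output buffer, growing each one by 8 bytes to mimic up-conversion. It shows that an output buffer sized identically to the input overflows, while one with slack space of 8 bytes per message does not.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

final class CleanerBufferSketch {
    // From the report: the message header is 8 bytes larger in 0.10.0.
    static final int V1_EXTRA_BYTES = 8;

    // Hypothetical helper: worst-case output size if every old-format
    // message in the input is rewritten in the new format.
    static int requiredOutputSize(int inputBytes, int messageCount) {
        int slack = Math.multiplyExact(messageCount, V1_EXTRA_BYTES);
        return Math.addExact(inputBytes, slack);
    }

    // Hypothetical helper: writes messageCount dummy messages of
    // (v0Size + 8) bytes into output. Returns false if the buffer
    // overflows, which is the failure seen in the stack trace.
    static boolean convertInto(ByteBuffer output, int messageCount, int v0Size) {
        byte[] converted = new byte[v0Size + V1_EXTRA_BYTES];
        try {
            for (int i = 0; i < messageCount; i++) {
                output.put(converted);
            }
            return true;
        } catch (BufferOverflowException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int messageCount = 100;
        int v0Size = 50;                        // illustrative size only
        int inputBytes = messageCount * v0Size; // 5000 bytes of input

        // Output buffer sized identically to the input: overflows.
        ByteBuffer sameSize = ByteBuffer.allocate(inputBytes);
        System.out.println(convertInto(sameSize, messageCount, v0Size));

        // Output buffer with 8 bytes of slack per message: fits.
        ByteBuffer withSlack =
            ByteBuffer.allocate(requiredOutputSize(inputBytes, messageCount));
        System.out.println(convertInto(withSlack, messageCount, v0Size));
    }
}
```

The pull request takes the other route described in the report: it avoids the conversion during compaction altogether, so no slack-space calculation is needed.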