[ https://issues.apache.org/jira/browse/KAFKA-527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380657#comment-14380657 ]
Jun Rao commented on KAFKA-527:
-------------------------------
[~ymatsuda], thanks for the patch. +1 and committed to trunk. Leave this jira
open for the other patch from Guozhang.
> Compression support does numerous byte copies
> ---------------------------------------------
>
> Key: KAFKA-527
> URL: https://issues.apache.org/jira/browse/KAFKA-527
> Project: Kafka
> Issue Type: Bug
> Components: compression
> Reporter: Jay Kreps
> Assignee: Yasuhiro Matsuda
> Priority: Critical
> Attachments: KAFKA-527.message-copy.history, KAFKA-527.patch,
> KAFKA-527_2015-03-16_15:19:29.patch, KAFKA-527_2015-03-19_21:32:24.patch,
> KAFKA-527_2015-03-25_12:08:00.patch, java.hprof.no-compression.txt,
> java.hprof.snappy.text
>
>
> The data path for compressing or decompressing messages is extremely
> inefficient. We do something like 7 (?) complete copies of the data, often
> for simple things like adding a 4 byte size to the front. I am not sure how
> this went by unnoticed.
> This is likely the root cause of the performance issues we saw in doing bulk
> recompression of data in mirror maker.
> Many of these copies are caused by the mismatch between the InputStream and
> OutputStream interfaces and the Message/MessageSet interfaces, which are
> based on byte buffers.
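
To illustrate the stream/byte-buffer mismatch described in the issue, here is a minimal Java sketch, not the committed patch. It assumes GZIP from the JDK as a stand-in codec, and the names SizePrefixedCompression, ByteBufferOutputStream, compressWithSizePrefix and decompress are hypothetical, chosen only for this example. The idea is to let the compressor write straight into the destination ByteBuffer and backfill the 4-byte size, rather than compressing into a temporary byte array and copying it into a second buffer just to prepend the size.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class SizePrefixedCompression {

    /** Hypothetical adapter: an OutputStream that writes directly into a ByteBuffer. */
    static final class ByteBufferOutputStream extends OutputStream {
        private final ByteBuffer buffer;

        ByteBufferOutputStream(ByteBuffer buffer) {
            this.buffer = buffer;
        }

        @Override
        public void write(int b) {
            buffer.put((byte) b);
        }

        @Override
        public void write(byte[] bytes, int off, int len) {
            // Bulk put; throws BufferOverflowException if the buffer is too small.
            buffer.put(bytes, off, len);
        }
    }

    /**
     * Compresses payload into one buffer laid out as [4-byte size][compressed bytes].
     * The caller is assumed to supply a capacity large enough for the output.
     */
    public static ByteBuffer compressWithSizePrefix(byte[] payload, int capacity)
            throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(capacity);
        buffer.position(4); // reserve room for the size field
        try (GZIPOutputStream gzip =
                     new GZIPOutputStream(new ByteBufferOutputStream(buffer))) {
            gzip.write(payload);
        }
        int compressedSize = buffer.position() - 4;
        buffer.putInt(0, compressedSize); // absolute put, position is unchanged
        buffer.flip();
        return buffer;
    }

    /** Reads the size prefix and decompresses from the backing array in place. */
    public static byte[] decompress(ByteBuffer buffer) throws IOException {
        int size = buffer.getInt(0);
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(
                buffer.array(), buffer.arrayOffset() + 4, size))) {
            return gzip.readAllBytes(); // requires Java 9 or later
        }
    }
}
```

This only removes the full-size intermediate copies around the size prefix; the compressor's own internal buffering is unchanged, and the decompress path above works only for heap-backed buffers.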