Does this mean that, currently, if a producer publishes an uncompressed
message to a server whose local log format is configured as compressed,
consumers will receive compressed messages when fetching?
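
To make the question concrete, here is a rough sketch of the double wrapping
described in the issue quoted below, as I understand it. SimpleMessage,
DoubleWrapSketch and the one-byte codec flag are hypothetical stand-ins I made
up for illustration; the real Message header has more fields and the real code
goes through CompressionUtils, so please treat this as an assumption-laden toy
and not the actual implementation.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.nio.ByteBuffer
import java.util.zip.{GZIPInputStream, GZIPOutputStream}

// Hypothetical, simplified stand-in for a Kafka message: a one-byte codec
// flag (0 = none, 1 = gzip) plus a payload. Not the real header layout.
final case class SimpleMessage(codec: Byte, payload: Array[Byte]) {
  // Serialized form: 4-byte length, 1-byte codec flag, payload bytes.
  def serialize: Array[Byte] = {
    val buf = ByteBuffer.allocate(4 + 1 + payload.length)
    buf.putInt(1 + payload.length).put(codec).put(payload)
    buf.array()
  }
}

object DoubleWrapSketch {
  val NoCodec: Byte = 0
  val Gzip: Byte = 1

  // Producer side: serialize each full message (header included), gzip the
  // concatenation, and wrap the compressed bytes in one outer message.
  def wrap(messages: Seq[SimpleMessage]): SimpleMessage = {
    val out = new ByteArrayOutputStream()
    val gz = new GZIPOutputStream(out)
    messages.foreach(m => gz.write(m.serialize))
    gz.close()
    SimpleMessage(Gzip, out.toByteArray)
  }

  // Consumer side: gunzip the outer payload, then parse the inner messages
  // back out of the decompressed bytes.
  def unwrap(outer: SimpleMessage): Seq[SimpleMessage] = {
    if (outer.codec == NoCodec) Seq(outer)
    else {
      val in = new GZIPInputStream(new ByteArrayInputStream(outer.payload))
      val out = new ByteArrayOutputStream()
      val chunk = new Array[Byte](4096)
      var n = in.read(chunk)
      while (n != -1) { out.write(chunk, 0, n); n = in.read(chunk) }
      in.close()

      val buf = ByteBuffer.wrap(out.toByteArray)
      val inner = Seq.newBuilder[SimpleMessage]
      while (buf.hasRemaining) {
        val size = buf.getInt()
        val codec = buf.get()
        val payload = new Array[Byte](size - 1)
        buf.get(payload)
        inner += SimpleMessage(codec, payload)
      }
      inner.result()
    }
  }
}
```

If the server stores and serves the outer message as-is, then in this toy model
the consumer always has to call something like unwrap, regardless of what the
producer originally sent. That is the behaviour I am asking about.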
 On Jul 19, 2012 5:08 PM, "Jay Kreps (JIRA)" <j...@apache.org> wrote:

>
>     [
> https://issues.apache.org/jira/browse/KAFKA-406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418395#comment-13418395]
>
> Jay Kreps commented on KAFKA-406:
> ---------------------------------
>
> Oh yes, and the other design requirement we had was that messages not be
> re-compressed on a fetch request. A simple implementation that didn't have
> this requirement would just be to have the consumer request N messages, and
> either specify to compress or not, and have the server read these into
> memory, decompress if its local log format is compressed, and then batch
> compress exactly the messages the client asked for, and send just that. The
> problem with this is that we have about a 5x read-to-write ratio so
> recompressing on each read is now recompressing the same stuff 5 times on
> average. This makes consumption way more expensive. I don't think this is a
> hard requirement but to make that approach fly we would have to demonstrate
> that the cpu overhead of compression would not become a serious bottleneck.
> I know this won't work with GZIP, but it might be possible to do it with
> snappy or a faster compression algo.
>
> > Gzipped payload is a fully wrapped Message (with headers), not just
> payload
> >
> ---------------------------------------------------------------------------
> >
> >                 Key: KAFKA-406
> >                 URL: https://issues.apache.org/jira/browse/KAFKA-406
> >             Project: Kafka
> >          Issue Type: Bug
> >          Components: core
> >    Affects Versions: 0.7.1
> >         Environment: N/A
> >            Reporter: Lorenzo Alberton
> >
> > When creating a gzipped MessageSet, the collection of Messages is passed
> to CompressionUtils.compress(), where each message is serialised [1] into a
> buffer (not just the payload, the full Message with headers, uncompressed),
> then gzipped, and finally wrapped into another Message [2].
> > In other words, the consumer has to unwrap the Message flagged as
> gzipped, unzip the payload, and unwrap the unzipped payload again as a
> non-compressed Message.
> > Is this double-wrapping the intended behaviour?
> > [1] messages.foreach(m => m.serializeTo(messageByteBuffer))
> > [2] new Message(outputStream.toByteArray, compressionCodec)
>
