Does this mean that, currently, if a producer publishes an uncompressed message to a server whose local log format is configured as compressed, consumers will receive compressed messages when fetching?

On Jul 19, 2012 5:08 PM, "Jay Kreps (JIRA)" <j...@apache.org> wrote:
>
> [
> https://issues.apache.org/jira/browse/KAFKA-406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418395#comment-13418395]
>
> Jay Kreps commented on KAFKA-406:
> ---------------------------------
>
> Oh yes, and the other design requirement we had was that messages not be
> re-compressed on a fetch request. A simple implementation that didn't have
> this requirement would just be to have the consumer request N messages, and
> either specify to compress or not, and have the server read these into
> memory, decompress if its local log format is compressed, and then batch
> compress exactly the messages the client asked for, and send just that. The
> problem with this is that we have about a 5x read-to-write ratio, so
> recompressing on each read means recompressing the same data 5 times on
> average. This makes consumption way more expensive. I don't think this is a
> hard requirement, but to make that approach fly we would have to demonstrate
> that the CPU overhead of compression would not become a serious bottleneck.
> I know this won't work with GZIP, but it might be possible to do it with
> Snappy or a faster compression algorithm.
>
> > Gzipped payload is a fully wrapped Message (with headers), not just payload
> > ---------------------------------------------------------------------------
> >
> >                 Key: KAFKA-406
> >                 URL: https://issues.apache.org/jira/browse/KAFKA-406
> >             Project: Kafka
> >          Issue Type: Bug
> >          Components: core
> >    Affects Versions: 0.7.1
> >         Environment: N/A
> >            Reporter: Lorenzo Alberton
> >
> > When creating a gzipped MessageSet, the collection of Messages is passed
> > to CompressionUtils.compress(), where each message is serialised [1] into a
> > buffer (not just the payload, the full Message with headers, uncompressed),
> > then gzipped, and finally wrapped into another Message [2].
> > In other words, the consumer has to unwrap the Message flagged as
> > gzipped, unzip the payload, and unwrap the unzipped payload again as a
> > non-compressed Message.
> >
> > Is this double-wrapping the intended behaviour?
> >
> > [1] messages.foreach(m => m.serializeTo(messageByteBuffer))
> > [2] new Message(outputStream.toByteArray, compressionCodec)
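
For reference, the double-wrapping described in the issue can be sketched roughly as below. This is only an illustration under simplifying assumptions, not Kafka's actual code: `Msg`, `serialize`, `deserializeAll`, and the `[length][codec][payload]` framing are hypothetical stand-ins for the real `Message` class and the 0.7.1 wire format, which carries more header fields. The point is just that the gzipped bytes contain complete framed messages, so the consumer has to unwrap twice.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.nio.ByteBuffer
import java.util.zip.{GZIPInputStream, GZIPOutputStream}

// Hypothetical, simplified model of the wrapping described in KAFKA-406.
// The framing here is an assumption for illustration, not the real format.
object WrappedGzipSketch {
  val NoCodec: Byte = 0
  val GzipCodec: Byte = 1

  final case class Msg(codec: Byte, payload: Array[Byte])

  // Assumed frame layout: [int32 length][codec byte][payload bytes]
  def serialize(m: Msg): Array[Byte] = {
    val buf = ByteBuffer.allocate(4 + 1 + m.payload.length)
    buf.putInt(1 + m.payload.length).put(m.codec).put(m.payload)
    buf.array()
  }

  // Reads back-to-back frames until the buffer is exhausted.
  def deserializeAll(bytes: Array[Byte]): List[Msg] = {
    val buf = ByteBuffer.wrap(bytes)
    val out = List.newBuilder[Msg]
    while (buf.remaining() > 0) {
      val len = buf.getInt()
      val codec = buf.get()
      val payload = new Array[Byte](len - 1)
      buf.get(payload)
      out += Msg(codec, payload)
    }
    out.result()
  }

  // Producer side: serialise each full message (header included),
  // gzip the concatenation, then wrap the result in one outer message.
  def compress(messages: Seq[Msg]): Msg = {
    val raw = new ByteArrayOutputStream()
    messages.foreach(m => raw.write(serialize(m)))   // compare [1]
    val zipped = new ByteArrayOutputStream()
    val gzip = new GZIPOutputStream(zipped)
    gzip.write(raw.toByteArray)
    gzip.close()
    Msg(GzipCodec, zipped.toByteArray)               // compare [2]
  }

  // Consumer side: unwrap the outer message, gunzip its payload,
  // then unwrap the inner messages a second time.
  def decompress(wrapper: Msg): List[Msg] = {
    require(wrapper.codec == GzipCodec, "not a gzip wrapper message")
    val in = new GZIPInputStream(new ByteArrayInputStream(wrapper.payload))
    val out = new ByteArrayOutputStream()
    val chunk = new Array[Byte](4096)
    var n = in.read(chunk)
    while (n != -1) {
      out.write(chunk, 0, n)
      n = in.read(chunk)
    }
    in.close()
    deserializeAll(out.toByteArray)
  }
}
```

In this sketch, `WrappedGzipSketch.decompress(WrappedGzipSketch.compress(msgs))` returns the original messages with their own codec flags intact. One consequence is that a wrapper message can be stored and forwarded as-is without the broker touching the inner messages.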