[ 
https://issues.apache.org/jira/browse/KAFKA-546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492552#comment-13492552
 ] 

Jun Rao commented on KAFKA-546:
-------------------------------

Thanks for patch v3. A couple of more comments:
30. ConsumerIterator.next(): In the following code, to be safe, we need to 
check if localCurrent hasNext in the while loop.
    // reject the messages that have already been consumed
    while (item.offset < currentTopicInfo.getConsumeOffset) {
      item = localCurrent.next()

31. ConsumerIteratorTest: Not sure if the test really does what it intends to. 
To simulate reading from the middle of a compressed messageset, we need to put 
in a consume offset larger than 0 in PartitionTopicInfo, right?
                
> Fix commit() in zk consumer for compressed messages
> ---------------------------------------------------
>
>                 Key: KAFKA-546
>                 URL: https://issues.apache.org/jira/browse/KAFKA-546
>             Project: Kafka
>          Issue Type: New Feature
>    Affects Versions: 0.8
>            Reporter: Jay Kreps
>            Assignee: Swapnil Ghike
>         Attachments: kafka-546-v1.patch, kafka-546-v2.patch, 
> kafka-546-v3.patch
>
>
> In 0.7.x and earlier versions offsets were assigned by the byte location in 
> the file. Because it wasn't possible to directly decompress from the middle 
> of a compressed block, messages inside a compressed message set effectively 
> had no offset. As a result the offset given to the consumer was always the 
> offset of the wrapper message set.
> In 0.8 after the logical offsets patch messages in a compressed set do have 
> offsets. However the server still needs to fetch from the beginning of the 
> compressed messageset (otherwise it can't be decompressed). As a result a 
> commit() which occurs in the middle of a message set will still result in 
> some duplicates.
> This can be fixed in the ConsumerIterator by discarding messages smaller than 
> the fetch offset rather than giving them to the consumer. This will make 
> commit work correctly in the presence of compressed messages (finally).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to