[ https://issues.apache.org/jira/browse/KAFKA-6530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390468#comment-16390468 ]
ASF GitHub Bot commented on KAFKA-6530:
---------------------------------------

dhruvilshah3 opened a new pull request #4660: KAFKA-6530: Use actual first offset of message set when rolling log segment
URL: https://github.com/apache/kafka/pull/4660

Use the exact first offset of the message set when rolling a log segment. For message format V2 and later this incurs no performance penalty, because the first offset is stored in the header. This augments the fix made in KAFKA-4451 by avoiding the heuristic for messages in V2 and later formats.

Added unit tests that simulate cases where a segment needs to roll because of overflow in index offsets. Verified that the new segment created in these cases uses the actual first offset instead of the heuristic used previously.

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Use actual first offset of messages when rolling log segment for magic v2
> -------------------------------------------------------------------------
>
>                 Key: KAFKA-6530
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6530
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Jason Gustafson
>            Assignee: Dhruvil Shah
>            Priority: Major
>
> We've implemented a heuristic to avoid overflowing when rolling a log segment
> to determine the base offset of the next segment without decompressing the
> message set to find the actual first offset. With the v2 message format, we
> can find the first offset without needing decompression, so we can set the
> correct base offset exactly.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
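
For readers following the thread, the difference between the heuristic and the exact first offset can be sketched roughly as below. This is an illustration only, not the code in the pull request: the class RollOffsetSketch and its methods fitsInIndex, heuristicRollOffset, and rollOffset are hypothetical names. It assumes that index entries store offsets relative to the segment's base offset in four bytes, and that the KAFKA-4451 heuristic takes the last offset of the message set minus Integer.MAX_VALUE as a value guaranteed not to exceed the real first offset.

```java
import java.util.OptionalLong;

/** Illustrative sketch only; these names are hypothetical, not Kafka's actual API. */
final class RollOffsetSketch {

    /**
     * Assumption: an index entry stores (offset - baseOffset) as a 4-byte int,
     * so that difference must lie in [0, Integer.MAX_VALUE].
     */
    static boolean fitsInIndex(long baseOffset, long offset) {
        long relative = offset - baseOffset;
        return relative >= 0 && relative <= Integer.MAX_VALUE;
    }

    /**
     * Heuristic for formats where the first offset is not known without
     * decompression: offsets within one message set span at most
     * Integer.MAX_VALUE, so this value is <= the real first offset.
     */
    static long heuristicRollOffset(long lastOffset) {
        return lastOffset - Integer.MAX_VALUE;
    }

    /**
     * Prefer the exact first offset when the header provides it (V2 and
     * later); otherwise fall back to the heuristic.
     */
    static long rollOffset(OptionalLong firstOffsetFromHeader, long lastOffset) {
        return firstOffsetFromHeader.orElse(heuristicRollOffset(lastOffset));
    }

    public static void main(String[] args) {
        long activeBaseOffset = 0L;
        long firstOffset = 3_000_000_000L; // first offset of the incoming message set
        long lastOffset = 3_000_000_009L;  // last offset of the incoming message set

        // The last offset no longer fits relative to the active segment, so roll.
        System.out.println(fitsInIndex(activeBaseOffset, lastOffset));
        // Older formats: base offset of the new segment comes from the heuristic.
        System.out.println(rollOffset(OptionalLong.empty(), lastOffset));
        // V2 and later: base offset of the new segment is the exact first offset.
        System.out.println(rollOffset(OptionalLong.of(firstOffset), lastOffset));
    }
}
```

With the sample values above, the heuristic gives 3000000009 - 2147483647 = 852516362 as the new base offset, while the header-based path gives 3000000000, which is the offset of the first message actually written to the new segment.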