Can anybody help me understand the above scenario? A link to any related KIP would also be helpful.

Thanks and Regards
Arpit Goyal
8861094754
On Fri, Nov 10, 2023 at 12:46 AM Arpit Goyal <goyal.arpit...@gmail.com> wrote:

> Please let me know if anyone can help me understand the above
> behaviour.
>
> On Fri, Nov 10, 2023, 00:08 Arpit Goyal <goyal.arpit...@gmail.com> wrote:
>
>> Recently I have been trying to understand Kafka's fetch offset
>> mechanism by reading the code, but I have some doubts that I have
>> not been able to resolve.
>>
>> *What I believe a Log Segment contains*
>>
>> A log segment consists of a list of record batches, with the base
>> offset as the key. Let's take an example and list the segments.
>>
>> *Segment 1*
>> base offset 50
>> List of record batches with start offset and last offset:
>> 1. (50,56) RB1
>> 2. (57,62) RB2
>> 3. (65,92) RB3
>> *Offset Index (baseoffset(50), relative offset(0), position(234))*
>>
>> *Segment 2*
>> base offset 93
>> List of record batches with start offset and last offset:
>> 1. (93,98) RB1
>> 2. (99,102) RB2
>> 3. (103,105) RB3
>>
>>
>> Process of fetching the data with offset >= targetOffset
>> Let's say targetOffset = 60.
>>
>> 1. We first try to find the segment whose base offset is the largest
>> one that is less than or equal to the targetOffset (
>> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LocalLog.scala#L396).
>> In the above case it would return *Segment 1*.
>>
>> 2. Reading the segment
>> <https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LocalLog.scala#L421>
>> 3. Then we look up the Offset Index and try to find the largest offset
>> less than or equal to the targetOffset. The index lookup is executed
>> in translateOffset. Code line
>> <https://github.com/apache/kafka/blob/trunk/storage/src/main/java/org/apache/kafka/storage/internals/log/LogSegment.java#L394>.
>> It would return a mapping which contains the offset and position,
>> i.e. 50,234.
>> 4. Using the start position *234* and the targetOffset 60, we
>> execute the function searchForOffsetWithSize
>> <https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/record/FileRecords.java#L316>
>> which returns the batch whose last offset >= targetOffset.
>> 5. According to the code, it will return Record Batch 2 of Segment 1,
>> i.e. RB2(57,62), because 62 >= 60.
>> 6. We return this LogOffsetPosition
>> <https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/record/FileRecords.java#L320>
>> (62, batch position, batch size)
>>
>> *My questions*
>> 1. The batch position and batch size correspond to *segment 1 RB2*.
>> RB2 starts at offset 57, so why are we sending the last offset (62)
>> together with the batch position?
>> 2. In the code after fetching the LogOffsetPosition
>> <https://github.com/apache/kafka/blob/trunk/storage/src/main/java/org/apache/kafka/storage/internals/log/LogSegment.java#L434>
>> I have not seen any usage of the returned last offset of the batch,
>> but I do see usage of the position value, which would be pointing to
>> offset 57.
>> 3. According to the algorithm, we are sending log data which starts
>> from offset 57 instead of offset 60. Does this not breach the
>> contract that we send log data with offset >= targetOffset?
>>
>> Can anyone help me identify the gap in my understanding, or what I am
>> missing here?
>>
>>
>> Thanks and Regards
>> Arpit Goyal
>> 8861094754
>>
>
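
The lookup described in steps 1 to 6 of the quoted message can be sketched as a small model. This is a simplified illustration in Python, not Kafka's actual code: the names (`Batch`, `Segment`, `floor_segment`, `index_lookup`, `search_for_offset`, `locate`) are hypothetical, and every byte position and size other than the index entry (50, 234) from the example is made up, as is the index entry assumed for Segment 2.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Batch:
    base_offset: int
    last_offset: int
    position: int  # byte position of the batch in the segment file (illustrative)
    size: int      # batch size in bytes (illustrative)


@dataclass
class Segment:
    base_offset: int
    batches: list  # record batches in file order
    index: list    # sparse offset index: sorted (offset, position) pairs


def floor_segment(segments, target):
    """Step 1: segment with the largest base offset <= target."""
    bases = [s.base_offset for s in segments]
    i = bisect_right(bases, target) - 1
    return segments[i] if i >= 0 else None


def index_lookup(segment, target):
    """Step 3: index entry with the largest offset <= target."""
    offsets = [offset for offset, _ in segment.index]
    i = bisect_right(offsets, target) - 1
    # Assumption: with no suitable entry, scanning starts at the segment start.
    return segment.index[i] if i >= 0 else (segment.base_offset, 0)


def search_for_offset(segment, target, start_position):
    """Steps 4-5: first batch at or after start_position with last offset >= target."""
    for batch in segment.batches:
        if batch.position >= start_position and batch.last_offset >= target:
            return batch
    return None


def locate(segments, target):
    """Steps 1-6 combined: the batch from which a fetch at `target` would read."""
    segment = floor_segment(segments, target)
    if segment is None:
        return None
    _, start_position = index_lookup(segment, target)
    return search_for_offset(segment, target, start_position)


# The two segments from the example; positions and sizes are invented.
SEGMENTS = [
    Segment(50,
            [Batch(50, 56, 234, 100), Batch(57, 62, 334, 120), Batch(65, 92, 454, 300)],
            [(50, 234)]),
    Segment(93,
            [Batch(93, 98, 0, 80), Batch(99, 102, 80, 60), Batch(103, 105, 140, 40)],
            [(93, 0)]),
]
```

In this model, `locate(SEGMENTS, 60)` returns the batch covering offsets 57 to 62, and its `position` marks the start of that batch (offset 57), which is the behaviour the three questions ask about.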