Please let me know if anyone can help me understand the above behaviour.

On Fri, Nov 10, 2023, 00:08 Arpit Goyal <goyal.arpit...@gmail.com> wrote:

> Recently I have been trying to understand the fetch offset mechanism of
> Kafka through the code, but I have certain doubts that I am still not able
> to resolve.
>
> *What I believe a log segment contains*
>
> A log segment consists of a list of record batches, with the base offset as
> its key. Let's take an example and list the segments.
> *Segment 1*
> base offset 50
> List of record batches as (start offset, last offset)
> 1. (50,56) RB1
> 2. (57,62) RB2
> 3. (65,92) RB3
> *Offset index: (base offset 50, relative offset 0, position 234)*
> *Segment 2*
> base offset 93
> List of record batches as (start offset, last offset)
> 1. (93,98) RB1
> 2. (99,102) RB2
> 3. (103,105) RB3
>
>
> Process of fetching data at offsets >= targetOffset
> Let's say targetOffset = 60
>
> 1. We first find the segment with the largest base offset that is less
> than or equal to the targetOffset (
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LocalLog.scala#L396).
> In the above case it would return *Segment 1*.
>
> 2. We read the segment
> <https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LocalLog.scala#L421>.
> 3. Then we look up the offset index to find the largest indexed offset
> less than or equal to the targetOffset. The index lookup is executed in
> translateOffset. Code line
> <https://github.com/apache/kafka/blob/trunk/storage/src/main/java/org/apache/kafka/storage/internals/log/LogSegment.java#L394>.
> It would return a mapping that contains an offset and a position, i.e. (50, 234).
> 4. Using the start position *234* and the targetOffset 60, we execute the
> function searchForOffsetWithSize
> <https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/record/FileRecords.java#L316>
> which returns the first batch whose last offset >= targetOffset.
> 5. According to the code, it will return record batch 2 of Segment 1,
> i.e. RB2 (57,62), because 62 >= 60.
> 6. We return this LogOffsetPosition
> <https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/record/FileRecords.java#L320>
> (62, batch position, batch size).
>
> *My questions*
> 1. The batch position and batch size correspond to *Segment 1 RB2*. RB2
> starts at offset 57, so why are we returning the last offset (62) together
> with the batch position?
> 2. In the code, after fetching the LogOffsetPosition
> <https://github.com/apache/kafka/blob/trunk/storage/src/main/java/org/apache/kafka/storage/internals/log/LogSegment.java#L434>,
> I have not seen any usage of the returned last offset of the batch, but
> I do see usage of the position value, which would point to offset 57.
> 3. According to the algorithm, we are sending log data that starts from
> the position of offset 57 instead of the position of offset 60. Does this
> not breach the contract that we send log data with offsets >= targetOffset?
> Can anyone help me identify what I am missing here?
>
>
> Thanks and Regards
> Arpit Goyal
> 8861094754
>

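To make the walkthrough in the quoted message concrete, here is a small Python model of the lookup as I described it. This is only a sketch of my understanding, not Kafka's code: the names (Batch, Segment, find_segment, index_lookup, search_for_offset, fetch_position) are made up for illustration, the batch sizes and all byte positions other than 234 are assumed values, and the index entry for Segment 2 is also an assumption since I did not list one.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Batch:
    base_offset: int
    last_offset: int
    position: int  # byte position in the segment file (assumed values)
    size: int      # batch size in bytes (assumed values)


@dataclass
class Segment:
    base_offset: int
    index: List[Tuple[int, int]]  # sparse (offset, position) entries, sorted by offset
    batches: List[Batch]          # sorted by position


def find_segment(segments: List[Segment], target: int) -> Optional[Segment]:
    """Step 1: segment with the largest base offset <= target."""
    bases = [s.base_offset for s in segments]
    i = bisect_right(bases, target) - 1
    return segments[i] if i >= 0 else None


def index_lookup(segment: Segment, target: int) -> Tuple[int, int]:
    """Step 3: index entry with the largest offset <= target."""
    offsets = [offset for offset, _ in segment.index]
    i = bisect_right(offsets, target) - 1
    # Fall back to the start of the segment when no entry qualifies.
    return segment.index[i] if i >= 0 else (segment.base_offset, 0)


def search_for_offset(segment: Segment, target: int, start_position: int) -> Optional[Batch]:
    """Step 4: first batch at or after start_position whose last offset >= target."""
    for batch in segment.batches:
        if batch.position >= start_position and batch.last_offset >= target:
            return batch
    return None


def fetch_position(segments: List[Segment], target: int) -> Optional[Tuple[int, int, int]]:
    """Steps 1-6 combined: returns (last offset, batch position, batch size)."""
    segment = find_segment(segments, target)
    if segment is None:
        return None
    _, start_position = index_lookup(segment, target)
    batch = search_for_offset(segment, target, start_position)
    if batch is None:
        return None
    return (batch.last_offset, batch.position, batch.size)


# The two segments from the example; sizes of 100 bytes are assumed.
SEGMENTS = [
    Segment(50, [(50, 234)], [
        Batch(50, 56, 234, 100),   # RB1
        Batch(57, 62, 334, 100),   # RB2
        Batch(65, 92, 434, 100),   # RB3
    ]),
    Segment(93, [(93, 0)], [       # index entry assumed
        Batch(93, 98, 0, 100),     # RB1
        Batch(99, 102, 100, 100),  # RB2
        Batch(103, 105, 200, 100), # RB3
    ]),
]

print(fetch_position(SEGMENTS, 60))  # prints (62, 334, 100)
```

With targetOffset = 60 the model returns RB2 of Segment 1: the reported offset is 62, while the position is where the batch containing offsets 57 to 62 begins, which is exactly the mismatch my questions are about.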