Can anybody help me understand the above scenario? A link to any related KIP
would also be helpful.
Thanks and Regards
Arpit Goyal
8861094754


On Fri, Nov 10, 2023 at 12:46 AM Arpit Goyal <goyal.arpit...@gmail.com>
wrote:

> Please let me know if anyone can help me understand the above
> behaviour.
>
> On Fri, Nov 10, 2023, 00:08 Arpit Goyal <goyal.arpit...@gmail.com> wrote:
>
>> I have recently been trying to understand Kafka's fetch offset mechanism
>> by reading the code, but I have some doubts that I have not been able to
>> resolve.
>>
>> *What I believe a log segment contains*
>>
>> A log segment consists of a list of record batches, keyed by base
>> offset. Let's take an example with two segments.
>> *Segment 1 *
>> base offset 50
>> List of Record Batch with start offset  and last offset
>> 1. (50,56) RB1
>> 2. (57,62) RB2
>> 3. (65,92) RB3
>> *Offset index entry: (base offset 50, relative offset 0, position 234)*
>> *Segment 2*
>> base offset 93
>> List of Record Batch with start offset  and last offset
>> 1. (93,98) RB1
>> 2. (99,102) RB2
>> 3. (103,105) RB3
>>
>>
>> Process of fetching data with offset >= targetOffset
>> Let's say targetOffset = 60
>>
>> 1. We first find the segment with the largest base offset that is less
>> than or equal to the targetOffset (
>> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LocalLog.scala#L396).
>> In the above case it would return *Segment 1*.
>>
>> 2. Reading the segment
>> <https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LocalLog.scala#L421>
>> 3. Then we look up the offset index to find the largest offset less than
>> or equal to the targetOffset. The index lookup is executed in
>> translateOffset. Code line
>> <https://github.com/apache/kafka/blob/trunk/storage/src/main/java/org/apache/kafka/storage/internals/log/LogSegment.java#L394>.
>> It would return a mapping containing the offset and position, i.e. (50, 234).
>> 4. Using the start position *234* and the targetOffset 60, we execute
>> the function searchForOffsetWithSize
>> <https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/record/FileRecords.java#L316>
>> which returns the first batch whose last offset >= targetOffset.
>> 5. According to the code, it will return record batch 2 of Segment 1,
>> i.e. RB2 (57,62), because 62 >= 60.
>> 6. We return this LogOffsetPosition
>> <https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/record/FileRecords.java#L320>
>> (62, batch position, batch size)
>>
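The lookup in steps 1 to 6 above can be sketched as a small Python model. This is only an illustration of the steps as described in this thread, not Kafka's actual implementation: the `Batch` and `Segment` classes and the helper names are hypothetical, and every byte position and size other than the index position 234 is made up for the example.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Batch:
    base_offset: int
    last_offset: int
    position: int  # byte position in the segment file (hypothetical values)
    size: int      # batch size in bytes (hypothetical values)


@dataclass(frozen=True)
class Segment:
    base_offset: int
    index: tuple    # sparse offset index: ((offset, position), ...)
    batches: tuple  # record batches in file order


def floor_segment(segments, target):
    """Step 1: segment with the largest base offset <= target."""
    bases = [s.base_offset for s in segments]
    i = bisect_right(bases, target) - 1
    return segments[i] if i >= 0 else None


def index_lookup(segment, target):
    """Step 3: largest index entry with offset <= target.

    Assumption: with no matching entry, scanning starts at position 0.
    """
    offsets = [offset for offset, _ in segment.index]
    i = bisect_right(offsets, target) - 1
    return segment.index[i] if i >= 0 else (segment.base_offset, 0)


def search_for_offset_with_size(segment, target, start_position):
    """Steps 4-6: first batch at or after start_position whose last
    offset is >= target, returned as (last offset, position, size)."""
    for batch in segment.batches:
        if batch.position < start_position:
            continue
        if batch.last_offset >= target:
            return (batch.last_offset, batch.position, batch.size)
    return None


def translate_offset(segments, target):
    segment = floor_segment(segments, target)
    if segment is None:
        return None
    _, start_position = index_lookup(segment, target)
    return search_for_offset_with_size(segment, target, start_position)


# The two segments from the example; only position 234 comes from the thread.
SEGMENTS = (
    Segment(50, ((50, 234),), (
        Batch(50, 56, 234, 100),
        Batch(57, 62, 334, 100),
        Batch(65, 92, 434, 100),
    )),
    Segment(93, (), (
        Batch(93, 98, 0, 100),
        Batch(99, 102, 100, 100),
        Batch(103, 105, 200, 100),
    )),
)

print(translate_offset(SEGMENTS, 60))  # (62, 334, 100)
```

In this model the returned position (334) is the start of RB2, whose first offset is 57, so a read that begins at that position also covers offsets 57 to 59. That is the behaviour the questions below ask about.
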
>> *My questions*
>> 1. The batch position and batch size correspond to *Segment 1 RB2*. RB2
>> starts at offset 57, so why are we sending the last offset (62) along
>> with the batch position?
>> 2. In the code after fetching the LogOffsetPosition
>> <https://github.com/apache/kafka/blob/trunk/storage/src/main/java/org/apache/kafka/storage/internals/log/LogSegment.java#L434>,
>> I have not seen any use of the batch's returned last offset value, but I
>> do see use of the position value, which would point to offset 57.
>> 3. According to the algorithm, we are sending log data that starts from
>> the position of offset 57 instead of offset 60. Does this not breach the
>> contract that we send log data with offset >= targetOffset?
>> Can anyone help me identify the gap in my understanding?
>>
>>
>> Thanks and Regards
>> Arpit Goyal
>> 8861094754
>>
>
