Could anyone please help me understand the behaviour described in my earlier message, quoted below?
On Fri, Nov 10, 2023, 00:08 Arpit Goyal <goyal.arpit...@gmail.com> wrote:

> Recently I have been trying to understand Kafka's fetch offset mechanism
> by reading the code, but I have some doubts that I still cannot resolve.
>
> *What I believe a log segment contains*
>
> A log segment consists of a list of record batches, keyed by base offset.
> Let's take an example and list the segments.
>
> *Segment 1*
> base offset 50
> List of record batches with start offset and last offset
> 1. (50,56) RB1
> 2. (57,62) RB2
> 3. (65,92) RB3
> *Offset Index (baseoffset(50), relative offset(0), position(234))*
>
> *Segment 2*
> base offset 93
> List of record batches with start offset and last offset
> 1. (93,98) RB1
> 2. (99,102) RB2
> 3. (103,105) RB3
>
> Process of fetching the data with offset >= targetOffset
> Let's say targetOffset = 60.
>
> 1. We first find the segment with the largest base offset that is less
> than or equal to the target offset
> (https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LocalLog.scala#L396).
> In the above case it would return *Segment 1*.
> 2. Reading the segment
> <https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LocalLog.scala#L421>
> 3. Then we look up the offset index and find the largest offset that is
> less than or equal to the targetOffset. The index lookup is executed in
> translateOffset. Code line
> <https://github.com/apache/kafka/blob/trunk/storage/src/main/java/org/apache/kafka/storage/internals/log/LogSegment.java#L394>.
> It would return a mapping that contains the offset and position, i.e. 50,234.
> 4. Using the start position *234* and the targetOffset 60, we execute
> the function searchForOffsetWithSize
> <https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/record/FileRecords.java#L316>,
> which returns the first batch whose last offset >= targetOffset.
> 5. According to the code, it will return record batch 2 of Segment 1,
> i.e. RB2(57,62), because 62 >= 60.
> 6. We return this LogOffsetPosition
> <https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/record/FileRecords.java#L320>
> (62, batch position, batch size).
>
> *My questions*
> 1. The batch position and batch size correspond to *Segment 1 RB2*. RB2
> starts at offset 57, so why is the last offset (62) returned together
> with the batch position?
> 2. In the code after fetching the LogOffsetPosition
> <https://github.com/apache/kafka/blob/trunk/storage/src/main/java/org/apache/kafka/storage/internals/log/LogSegment.java#L434>
> I have not seen any usage of the returned last offset of the batch, but
> I do see usage of the position value, which points to offset 57.
> 3. According to the algorithm, we are sending log data that starts at
> offset 57 instead of offset 60. Does this not breach the contract that
> we send log data with offset >= targetOffset?
>
> Can anyone help me identify the gap in my understanding here?
>
> Thanks and Regards
> Arpit Goyal
> 8861094754
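
To make the six steps in the quoted message concrete, here is a minimal Python sketch of the lookup as described there. It is only a model of my understanding, not Kafka's actual code: the names Batch, Segment, floor_segment, index_lookup and locate are hypothetical, every byte position other than 234 and the batch size of 100 are made-up values, the index entry for Segment 2 is assumed, and the fallback to (base offset, 0) when the index has no suitable entry is also an assumption.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Batch:
    base_offset: int
    last_offset: int
    position: int  # byte position in the segment file (made-up value)
    size: int      # batch size in bytes (made-up value)


@dataclass(frozen=True)
class Segment:
    base_offset: int
    batches: Tuple[Batch, ...]
    index: Tuple[Tuple[int, int], ...]  # sparse (offset, position) entries


# The two segments from the example in the quoted message.
SEGMENTS = (
    Segment(50,
            (Batch(50, 56, 234, 100), Batch(57, 62, 334, 100), Batch(65, 92, 434, 100)),
            ((50, 234),)),
    Segment(93,
            (Batch(93, 98, 0, 100), Batch(99, 102, 100, 100), Batch(103, 105, 200, 100)),
            ((93, 0),)),  # assumed index entry, not given in the example
)


def floor_segment(segments, target: int) -> Optional[Segment]:
    """Step 1: segment with the largest base offset <= target."""
    bases = [s.base_offset for s in segments]
    i = bisect_right(bases, target) - 1
    return segments[i] if i >= 0 else None


def index_lookup(segment: Segment, target: int) -> Tuple[int, int]:
    """Step 3: index entry with the largest offset <= target."""
    offsets = [offset for offset, _ in segment.index]
    i = bisect_right(offsets, target) - 1
    # Assumed fallback: start of the segment when no entry qualifies.
    return segment.index[i] if i >= 0 else (segment.base_offset, 0)


def search_for_offset_with_size(segment: Segment, target: int, start_position: int):
    """Steps 4-6: first batch at or after start_position whose last offset >= target."""
    for batch in segment.batches:
        if batch.position >= start_position and batch.last_offset >= target:
            return (batch.last_offset, batch.position, batch.size)
    return None


def locate(segments, target: int):
    segment = floor_segment(segments, target)
    if segment is None:
        return None
    _, start_position = index_lookup(segment, target)
    return search_for_offset_with_size(segment, target, start_position)


print(locate(SEGMENTS, 60))  # (62, 334, 100)
```

For targetOffset = 60 the sketch returns the last offset 62 together with the position and size of RB2, whose first record is at offset 57. That is exactly the situation questions 1 and 3 ask about: the returned position points at data that begins before the target offset.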