junrao commented on code in PR #15621:
URL: https://github.com/apache/kafka/pull/15621#discussion_r1548427028


##########
clients/src/main/java/org/apache/kafka/common/record/MemoryRecordsBuilder.java:
##########
@@ -240,25 +240,54 @@ public MemoryRecords build() {
         return builtRecords;
     }
 
+
     /**
-     * Get the max timestamp and its offset. The details of the offset 
returned are a bit subtle.
-     * Note: The semantic for the offset of max timestamp is the first offset 
with the max timestamp if there are multi-records having same timestamp.
-     *
-     * If the log append time is used, the offset will be the first offset of 
the record.
-     *
-     * If create time is used, the offset will always be the offset of the 
record with the max timestamp.
-     *
-     * If it's NO_TIMESTAMP (i.e. MAGIC_VALUE_V0), we'll return offset -1 
since no timestamp info in records.
-     *
-     * @return The max timestamp and its offset
+     * There are three cases of finding max timestamp to return:
+     * 1) version 0: The max timestamp is NO_TIMESTAMP (-1)
+     * 2) LogAppendTime: All records have same timestamp, and so the max 
timestamp is equal to logAppendTime
+     * 3) CreateTime: The max timestamp of record
+     * <p>
+     * Let's talk about OffsetOfMaxTimestamp. There are some paths that we 
don't try to find the OffsetOfMaxTimestamp
+     * to avoid expensive records iteration. Those paths include follower 
append and index recovery. In order to
+     * avoid inconsistent time index, we let all paths find 
shallowOffsetOfMaxTimestamp instead of OffsetOfMaxTimestamp.
+     * <p>
+     * Let's define the shallowOffsetOfMaxTimestamp: It is last offset of the 
batch having max timestamp. If there are
+     * many batches having same max timestamp, we pick up the earliest batch.
+     * <p>
+     * There are five cases of finding shallowOffsetOfMaxTimestamp to return:
+     * 1) version 0: It is always the -1
+     * 2) LogAppendTime with single batch: It is the offset of last record
+     * 3) LogAppendTime with many single-record batches: Those single-record 
batches have same max timestamp, so the
+     *                                                   base offset is equal 
with the last offset of earliest batch

Review Comment:
   so the base offset is equal with the last offset of earliest batch => so we 
return the base offset, which is equal to the last offset of earliest batch



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to