Re: [PR] KAFKA-15046: Get rid of unnecessary fsyncs inside UnifiedLog.lock to stabilize performance [kafka]

via GitHub Sat, 18 Nov 2023 12:16:01 -0800


ocadaruma commented on code in PR #14242:
URL: https://github.com/apache/kafka/pull/14242#discussion_r1398265953



##########
storage/src/main/java/org/apache/kafka/storage/internals/epoch/LeaderEpochFileCache.java:
##########
@@ -308,7 +308,14 @@ public void truncateFromEnd(long endOffset) {
             if (endOffset >= 0 && epochEntry.isPresent() && 
epochEntry.get().startOffset >= endOffset) {
                 List<EpochEntry> removedEntries = removeFromEnd(x -> 
x.startOffset >= endOffset);
 
-                flush();
+                // We intentionally don't force flushing change to the device 
here because:
+                // - To avoid fsync latency
+                //   * fsync latency could be huge on a disk glitch, which is 
not rare in spinning drives
+                //   * This method is called by ReplicaFetcher threads, which 
could block replica fetching
+                //     then causing ISR shrink or high produce response time 
degradation in remote scope on high fsync latency.
+                // - Even when stale epochs remained in LeaderEpoch file due 
to the unclean shutdown, it will be handled by
+                //   another truncateFromEnd call on log loading procedure so 
it won't be a problem
+                flush(false);

Review Comment:
   I take another look at the code and found that flushing to the file (without 
fsync) is necessary.
   
   The point here is if there's any code path that reloads the leader-epoch 
cache from the file.
   I found it's possible, so not flushing could be a problem in below scenario
   - (1) AlterReplicaDir is initiated
   - (2) truncation happens on futureLog
       * LeaderEpochFileCache.truncateFromEnd is called, but it isn't flushed 
to the file
   - (3) future log caught up and the 
[renameDir](https://github.com/apache/kafka/blob/3.6.0/core/src/main/scala/kafka/log/UnifiedLog.scala#L681)
 is called
       * This will reload the leader-epoch cache from the file, which is stale
       * Then wrong leader-epoch may be returned (e.g. for list-offsets request)
   
   So we still should flush to the file even without fsync.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Re: [PR] KAFKA-15046: Get rid of unnecessary fsyncs inside UnifiedLog.lock to stabilize performance [kafka]

Reply via email to