junrao commented on code in PR #14242:
URL: https://github.com/apache/kafka/pull/14242#discussion_r1401104692
##########
storage/src/main/java/org/apache/kafka/storage/internals/log/SnapshotFile.java:
##########
@@ -60,10 +60,10 @@ public File file() {
return file;
}
- public void renameTo(String newSuffix) throws IOException {
+ public void renameToDelete(String newSuffix) throws IOException {
Review Comment:
Could we remove `newSuffix` since it's always `DELETED_FILE_SUFFIX`?
##########
storage/src/main/java/org/apache/kafka/storage/internals/epoch/LeaderEpochFileCache.java:
##########
@@ -308,7 +308,14 @@ public void truncateFromEnd(long endOffset) {
if (endOffset >= 0 && epochEntry.isPresent() &&
epochEntry.get().startOffset >= endOffset) {
List<EpochEntry> removedEntries = removeFromEnd(x ->
x.startOffset >= endOffset);
- flush();
+ // We intentionally don't force flushing change to the device
here because:
+ // - To avoid fsync latency
+ // * fsync latency could be huge on a disk glitch, which is
not rare in spinning drives
+ // * This method is called by ReplicaFetcher threads, which
could block replica fetching
+ // then causing ISR shrink or high produce response time
degradation in remote scope on high fsync latency.
+ // - Even when stale epochs remained in LeaderEpoch file due
to the unclean shutdown, it will be handled by
+ // another truncateFromEnd call on log loading procedure so
it won't be a problem
+ flush(false);
Review Comment:
Thanks, @ocadaruma. Good point! So, we could still write to the file without
flushing. The name `flush` implies that it fsyncs to disks. How about renaming
it to sth like `writeToFile`?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]