ocadaruma commented on code in PR #15993: URL: https://github.com/apache/kafka/pull/15993#discussion_r1625200232
########## storage/src/main/java/org/apache/kafka/storage/internals/epoch/LeaderEpochFileCache.java: ########## @@ -305,22 +341,23 @@ public Map.Entry<Integer, Long> endOffsetFor(int requestedEpoch, long logEndOffs /** * Removes all epoch entries from the store with start offsets greater than or equal to the passed offset. + * <p> + * Checkpoint-flushing is done asynchronously. */ public void truncateFromEnd(long endOffset) { lock.writeLock().lock(); try { - Optional<EpochEntry> epochEntry = latestEntry(); - if (endOffset >= 0 && epochEntry.isPresent() && epochEntry.get().startOffset >= endOffset) { - List<EpochEntry> removedEntries = removeFromEnd(x -> x.startOffset >= endOffset); - - // We intentionally don't force flushing change to the device here because: + List<EpochEntry> removedEntries = truncateFromEnd(epochs, endOffset); + if (!removedEntries.isEmpty()) { + // We flush the change to the device in the background because: // - To avoid fsync latency // * fsync latency could be huge on a disk glitch, which is not rare in spinning drives // * This method is called by ReplicaFetcher threads, which could block replica fetching // then causing ISR shrink or high produce response time degradation in remote scope on high fsync latency. - // - Even when stale epochs remained in LeaderEpoch file due to the unclean shutdown, it will be handled by - // another truncateFromEnd call on log loading procedure so it won't be a problem - writeToFile(false); + // - We still flush the change in #assign synchronously, meaning that it's guaranteed that the checkpoint file always has no missing entries. + // * Even when stale epochs are restored from the checkpoint file after the unclean shutdown, it will be handled by + // another truncateFromEnd call on log loading procedure, so it won't be a problem + scheduler.scheduleOnce("leader-epoch-cache-flush-" + topicPartition, this::writeToFileForTruncation); Review Comment: Not strictly necessary. I just followed some of existing scheduled tasks's convention (e.g. https://github.com/apache/kafka/blob/c24f94936d4370b2a3f3bfad42c56198208079b4/core/src/main/scala/kafka/coordinator/transaction/TransactionStateManager.scala#L571) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org