[ 
https://issues.apache.org/jira/browse/KAFKA-20016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

José Armando García Sancio updated KAFKA-20016:
-----------------------------------------------
    Description: 
If a KRaft replica stays offline for a while, it is possible to see the 
following error:
{code:java}
org.apache.kafka.common.errors.OffsetOutOfRangeException: Cannot increment the log start offset to 126032410 of partition __cluster_metadata-0 since it is larger than the high watermark 126013494{code}
This happens because the snapshot cleaning code can execute before the HWM is 
known: RaftMetadataLogCleanerManager::maybeClean doesn't check the HWM before 
calling KafkaRaftLog::maybeClean.

Since the HWM is not known, the UnifiedLog will update the HWM to the offset of 
the oldest snapshot when that snapshot is deleted and the log start offset is 
updated:
{code:java}
      private void updateLogStartOffset(long offset) throws IOException {
          logStartOffset = offset;
          if (highWatermark() < offset) {
              updateHighWatermark(offset);
          }
          if (localLog.recoveryPoint() < offset) {
              localLog.updateRecoveryPoint(offset);
          }
      } {code}
When the next snapshot is deleted the following check will fail:
{code:java}
       public boolean maybeIncrementLogStartOffset(long newLogStartOffset, LogStartOffsetIncrementReason reason) {
...
          return maybeHandleIOException(
                  () -> "Exception while increasing log start offset for " + topicPartition() + " to " + newLogStartOffset + " in dir " + dir().getParent(),
                  () -> {
                      synchronized (lock) {
                          if (newLogStartOffset > highWatermark()) {
                              throw new OffsetOutOfRangeException("Cannot increment the log start offset to " + newLogStartOffset + " of partition " + topicPartition() +
                                      " since it is larger than the high watermark " + highWatermark());
                          }
...{code}
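The failure sequence can be reproduced with a minimal, self-contained model. The class and method names here are hypothetical; only the update/check logic mirrors the excerpts above:

```java
// Self-contained model of the failure sequence. Class and method names are
// hypothetical; only the logic mirrors the UnifiedLog excerpts.
public class HwmBugDemo {
    static long highWatermark = 0;   // HWM is not yet known after a restart
    static long logStartOffset = 0;

    // Mirrors UnifiedLog.updateLogStartOffset: silently drags the HWM
    // forward when the new log start offset is ahead of it.
    static void updateLogStartOffset(long offset) {
        logStartOffset = offset;
        if (highWatermark < offset) {
            highWatermark = offset;
        }
    }

    // Mirrors the guard in UnifiedLog.maybeIncrementLogStartOffset.
    static void maybeIncrementLogStartOffset(long newLogStartOffset) {
        if (newLogStartOffset > highWatermark) {
            throw new IllegalStateException(
                "Cannot increment the log start offset to " + newLogStartOffset
                + " since it is larger than the high watermark " + highWatermark);
        }
        updateLogStartOffset(newLogStartOffset);
    }

    public static void main(String[] args) {
        // Deleting the oldest snapshot updates the log start offset directly,
        // which bumps the still-unknown HWM up to that snapshot's offset.
        updateLogStartOffset(126013494L);
        // Deleting the next snapshot goes through the guarded path and fails,
        // since 126032410 > 126013494.
        try {
            maybeIncrementLogStartOffset(126032410L);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```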
KRaft has an invariant that snapshots only exist for offsets that are less than 
the HWM. I see two solutions to this problem:
 # Don't delete snapshots or segments until the HWM is known. Once the HWM is 
known, KRaft will update the HWM tracked by the UnifiedLog.
 # During startup, set the HWM in the UnifiedLog to the offset of the largest 
snapshot tracked by KRaft.
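
A rough sketch of option 1, where cleaning is simply skipped while the HWM is unknown. The field and method names below are assumptions for illustration, not the actual Kafka code:

```java
import java.util.OptionalLong;

// Hypothetical sketch of option 1: skip snapshot/segment cleaning until
// the HWM is known. Names are assumptions, not the actual Kafka code.
public class SnapshotCleaner {
    private OptionalLong highWatermark = OptionalLong.empty();

    // Called when the raft client learns the HWM.
    public void onHighWatermarkUpdate(long hwm) {
        highWatermark = OptionalLong.of(hwm);
    }

    // Returns true if cleaning ran; false if it was skipped because the
    // HWM is not yet known, preserving the snapshot/HWM invariant.
    public boolean maybeClean(Runnable cleanSnapshotsAndSegments) {
        if (highWatermark.isEmpty()) {
            return false;
        }
        cleanSnapshotsAndSegments.run();
        return true;
    }
}
```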



> Wait until HWM is known before deleting snapshots
> -------------------------------------------------
>
>                 Key: KAFKA-20016
>                 URL: https://issues.apache.org/jira/browse/KAFKA-20016
>             Project: Kafka
>          Issue Type: Bug
>          Components: kraft
>            Reporter: José Armando García Sancio
>            Assignee: José Armando García Sancio
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
