[
https://issues.apache.org/jira/browse/KAFKA-20016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
José Armando García Sancio updated KAFKA-20016:
-----------------------------------------------
Description:
If a KRaft replica stays offline for a while, it can fail with the
following error:
{code:java}
org.apache.kafka.common.errors.OffsetOutOfRangeException: Cannot increment the
log start offset to 126032410 of partition __cluster_metadata-0 since it is
larger than the high watermark 126013494{code}
This happens because the snapshot cleaning code can run before the high
watermark (HWM) is known. The method RaftMetadataLogCleanerManager::maybeClean
doesn't check the HWM before calling KafkaRaftLog::maybeClean.
Since the HWM is not known, the UnifiedLog advances its HWM to the offset of
the oldest snapshot when that snapshot is deleted and the log start offset is
updated:
{code:java}
private void updateLogStartOffset(long offset) throws IOException {
    logStartOffset = offset;
    if (highWatermark() < offset) {
        updateHighWatermark(offset);
    }
    if (localLog.recoveryPoint() < offset) {
        localLog.updateRecoveryPoint(offset);
    }
}
{code}
When the next snapshot is deleted, the following check fails:
{code:java}
public boolean maybeIncrementLogStartOffset(long newLogStartOffset,
        LogStartOffsetIncrementReason reason) {
    ...
    return maybeHandleIOException(
        () -> "Exception while increasing log start offset for " +
            topicPartition() + " to " + newLogStartOffset + " in dir " + dir().getParent(),
        () -> {
            synchronized (lock) {
                if (newLogStartOffset > highWatermark()) {
                    throw new OffsetOutOfRangeException(
                        "Cannot increment the log start offset to " + newLogStartOffset +
                        " of partition " + topicPartition() +
                        " since it is larger than the high watermark " + highWatermark());
                }
                ...
{code}
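The sequence described above can be illustrated with a toy model. This is only
a sketch, not Kafka code: ToyLog and HwmFailureDemo are hypothetical names, an
unknown HWM is modeled as 0, IllegalStateException stands in for
OffsetOutOfRangeException, and how the first deletion reaches
updateLogStartOffset is not modeled. The offsets are taken from the error
message in this issue.

```java
// Toy stand-in for the two UnifiedLog methods quoted above; not Kafka code.
class ToyLog {
    private long logStartOffset = 0L;
    // Assumption: an unknown HWM is modeled as 0.
    private long highWatermark = 0L;

    long highWatermark() {
        return highWatermark;
    }

    // Mirrors the quoted updateLogStartOffset: the HWM is dragged up to
    // the new log start offset when it is behind it.
    void updateLogStartOffset(long offset) {
        logStartOffset = offset;
        if (highWatermark < offset) {
            highWatermark = offset;
        }
    }

    // Mirrors the quoted check: refuse to move the log start offset past the HWM.
    boolean maybeIncrementLogStartOffset(long newLogStartOffset) {
        if (newLogStartOffset > highWatermark) {
            throw new IllegalStateException(
                "Cannot increment the log start offset to " + newLogStartOffset +
                " since it is larger than the high watermark " + highWatermark);
        }
        if (newLogStartOffset > logStartOffset) {
            updateLogStartOffset(newLogStartOffset);
            return true;
        }
        return false;
    }
}

class HwmFailureDemo {
    // Returns the error message produced by the second deletion.
    static String reproduce() {
        ToyLog log = new ToyLog();
        // First deletion: the HWM is unknown, so it is set to the oldest snapshot's offset.
        log.updateLogStartOffset(126013494L);
        try {
            // Second deletion: the next snapshot is above the artificial HWM.
            log.maybeIncrementLogStartOffset(126032410L);
            return "no error";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(reproduce());
    }
}
```

In this model the HWM is pinned at the first snapshot's offset, so any later
snapshot offset trips the check until the real HWM is learned.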
KRaft has an invariant that snapshots only exist for offsets that are less
than the HWM. I see two solutions to this problem:
# Don't delete snapshots or segments until the HWM is known. Once the HWM is
known, KRaft will update the HWM tracked by the UnifiedLog.
# During startup, set the HWM in the UnifiedLog to the offset of the largest
snapshot tracked by KRaft.
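As a rough illustration of the two options, here is a minimal sketch. It is
not the actual fix: SnapshotCleanerSketch and all of its methods are
hypothetical, the HWM is modeled as an OptionalLong that is empty until it is
learned, and the value 126040000 used in the usage note is made up.

```java
import java.util.OptionalLong;

// Hypothetical sketch of both proposals; none of these names exist in Kafka.
class SnapshotCleanerSketch {
    // Empty until the replica learns the HWM.
    private OptionalLong highWatermark = OptionalLong.empty();
    private long logStartOffset = 0L;

    long logStartOffset() {
        return logStartOffset;
    }

    // Called once KRaft discovers the real HWM.
    void onHighWatermarkKnown(long hwm) {
        highWatermark = OptionalLong.of(hwm);
    }

    // Option 2: at startup, seed the HWM from the largest snapshot offset.
    void seedFromLargestSnapshot(long largestSnapshotOffset) {
        if (!highWatermark.isPresent()) {
            highWatermark = OptionalLong.of(largestSnapshotOffset);
        }
    }

    // Option 1: skip cleaning entirely while the HWM is unknown.
    // Returns true only if the log start offset actually advanced.
    boolean maybeClean(long snapshotOffset) {
        if (!highWatermark.isPresent()) {
            return false; // HWM unknown: do not delete anything yet
        }
        if (snapshotOffset > highWatermark.getAsLong()) {
            return false; // never move the log start offset past the HWM
        }
        if (snapshotOffset <= logStartOffset) {
            return false; // nothing to clean
        }
        logStartOffset = snapshotOffset;
        return true;
    }
}
```

With this guard, maybeClean(126013494) is a no-op on a fresh instance. After
onHighWatermarkKnown(126040000), both 126013494 and 126032410 can be cleaned
in order without hitting the failing check.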
> Wait until HWM is known before deleting snapshots
> -------------------------------------------------
>
> Key: KAFKA-20016
> URL: https://issues.apache.org/jira/browse/KAFKA-20016
> Project: Kafka
> Issue Type: Bug
> Components: kraft
> Reporter: José Armando García Sancio
> Assignee: José Armando García Sancio
> Priority: Major