Luke Chen created KAFKA-18911:
---------------------------------
Summary: alterPartition gets stuck when getting out-of-date errors
Key: KAFKA-18911
URL: https://issues.apache.org/jira/browse/KAFKA-18911
Project: Kafka
Issue Type: Bug
Affects Versions: 3.9.0
Reporter: Luke Chen
Assignee: Luke Chen
When the leader sends an AlterPartition request to the controller, the
controller performs
[some validation|https://github.com/apache/kafka/blob/898dcd11ad260e9b3cfefc5291c40e68009acb7d/metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java#L1231]
before processing it. On the leader side, when an error comes back, we decide
whether the request should be retried
[here|https://github.com/apache/kafka/blob/898dcd11ad260e9b3cfefc5291c40e68009acb7d/core/src/main/scala/kafka/cluster/Partition.scala#L1868].
However, in some non-retry cases we simply return false without changing the
partition state:
{code:java}
case Errors.UNKNOWN_TOPIC_OR_PARTITION =>
  info(s"Failed to alter partition to $proposedIsrState since the controller doesn't know about " +
    "this topic or partition. Partition state may be out of sync, awaiting new the latest metadata.")
  false
case Errors.UNKNOWN_TOPIC_ID =>
  info(s"Failed to alter partition to $proposedIsrState since the controller doesn't know about " +
    "this topic. Partition state may be out of sync, awaiting new the latest metadata.")
  false
case Errors.FENCED_LEADER_EPOCH =>
  info(s"Failed to alter partition to $proposedIsrState since the leader epoch is old. " +
    "Partition state may be out of sync, awaiting new the latest metadata.")
  false
case Errors.INVALID_UPDATE_VERSION =>
  info(s"Failed to alter partition to $proposedIsrState because the partition epoch is invalid. " +
    "Partition state may be out of sync, awaiting new the latest metadata.")
  false
case Errors.INVALID_REQUEST =>
  info(s"Failed to alter partition to $proposedIsrState because the request is invalid. " +
    "Partition state may be out of sync, awaiting new the latest metadata.")
  false
case Errors.NEW_LEADER_ELECTED =>
  // The operation completed successfully but this replica got removed from the replica set by the controller
  // while completing a ongoing reassignment. This replica is no longer the leader but it does not know it
  // yet. It should remain in the current pending state until the metadata overrides it.
  // This is only raised in KRaft mode.
  info(s"The alter partition request successfully updated the partition state to $proposedIsrState but " +
    "this replica got removed from the replica set while completing a reassignment. " +
    "Waiting on new metadata to clean up this replica.")
  false
{code}
As the log message says, "Partition state may be out of sync, awaiting new the
latest metadata". But because the partition state is not updated, it stays in
the `PendingExpandIsr` or `PendingShrinkIsr` state, which keeps `isInflight`
set to true. While in this state, the partition state will not be updated again.
The impact of this issue is that the ISR remains in a stale (wrong) state
until the leadership changes.
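To make the stuck state concrete, here is a minimal toy model of the behavior described above. It is only a sketch, not Kafka's real code: the class `StuckIsrSketch`, its enums and methods, and the `SOME_RETRIABLE_ERROR` placeholder are all hypothetical names, and it assumes that a new ISR change is only proposed when no request is in flight.
{code:java}
import java.util.ArrayList;
import java.util.List;

// Toy model only. Class, enum, and method names are hypothetical and are
// not Kafka's real API; they just mirror the shape of the issue.
class StuckIsrSketch {
    enum State {
        COMMITTED, PENDING_EXPAND_ISR, PENDING_SHRINK_ISR;

        // Any pending state counts as an in-flight AlterPartition request.
        boolean isInflight() {
            return this != COMMITTED;
        }
    }

    // The non-retry errors quoted above, plus a hypothetical placeholder
    // standing in for any error that the leader would retry.
    enum ErrorCode {
        UNKNOWN_TOPIC_OR_PARTITION, UNKNOWN_TOPIC_ID, FENCED_LEADER_EPOCH,
        INVALID_UPDATE_VERSION, INVALID_REQUEST, NEW_LEADER_ELECTED,
        SOME_RETRIABLE_ERROR
    }

    State state = State.COMMITTED;
    final List<String> sentRequests = new ArrayList<>();

    // Assumption: an expansion is only proposed when nothing is in flight.
    boolean maybeExpandIsr(int replicaId) {
        if (state.isInflight()) {
            return false;
        }
        state = State.PENDING_EXPAND_ISR;
        sentRequests.add("expand:" + replicaId);
        return true;
    }

    // Assumption: a shrink is only proposed when nothing is in flight.
    boolean maybeShrinkIsr(int replicaId) {
        if (state.isInflight()) {
            return false;
        }
        state = State.PENDING_SHRINK_ISR;
        sentRequests.add("shrink:" + replicaId);
        return true;
    }

    // Returns whether the request should be retried. Like the quoted handler,
    // the non-retry branch returns false and leaves the state untouched.
    boolean handleError(ErrorCode error) {
        return error == ErrorCode.SOME_RETRIABLE_ERROR;
    }

    public static void main(String[] args) {
        StuckIsrSketch partition = new StuckIsrSketch();
        System.out.println("expand sent: " + partition.maybeExpandIsr(2));
        System.out.println("retry: " + partition.handleError(ErrorCode.INVALID_UPDATE_VERSION));
        System.out.println("state: " + partition.state);
        System.out.println("shrink sent: " + partition.maybeShrinkIsr(3));
    }
}
{code}
In this sketch, once the expansion fails with a non-retry error, the state stays pending, so the later shrink proposal is skipped and no second request is ever sent. Nothing in the error path moves the state back to a committed one.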