Roland Sommer created KAFKA-20295:
-------------------------------------
Summary: Removed controllers still in metadata, blocking finalizing upgrade to 4.2.0
Key: KAFKA-20295
URL: https://issues.apache.org/jira/browse/KAFKA-20295
Project: Kafka
Issue Type: Bug
Components: controller
Environment: Kafka 4.2.0 (Scala 2.13) running on Debian Trixie 13.3
Reporter: Roland Sommer
While upgrading our Kafka clusters to new operating systems, I switched to
dynamic voter configuration and removed controller instances with
`/opt/kafka/bin/kafka-metadata-quorum.sh` and the `remove-controller`
subcommand. Inspecting the cluster with `describe` shows only the nodes that
are actually running.
Now, during the upgrade to 4.2.0, the final metadata upgrade step fails with:
```
Could not upgrade eligible.leader.replicas.version to 1. The update failed for
all features since the following feature had an error: Invalid update version
29 for feature metadata.version. Controller 351 only supports versions 7-27
```
with 351 being the ID of an already removed controller. Inspecting a snapshot
with `/opt/kafka/bin/kafka-metadata-shell.sh` shows that the controller IDs of
already removed controllers (351, 584, 686) are still listed alongside the
current ones:
```
>> ls image/cluster/controllers/
158 206 351 584 611 686
```
while other tools only show the expected nodes:
```
~$ /opt/kafka/bin/kafka-metadata-quorum.sh --bootstrap-controller localhost:9093 describe --replication --human-readable
NodeId  DirectoryId             LogEndOffset  Lag  LastFetchTimestamp  LastCaughtUpTimestamp  Status
158     2gsvOvnT7urpZcA_-LUy5w  196823524     0    7 ms ago            8 ms ago               Leader
611     27Ii-xdAZ7ReQBLsvvJb0A  196823524     0    348 ms ago          348 ms ago             Follower
206     Q7X9o3XbKxk_3tz4T8torg  196823524     0    348 ms ago          348 ms ago             Follower
226     7n6aedUEuytkqhBnbe7ESw  196823524     0    348 ms ago          348 ms ago             Observer
181     tZ17VQ8cYpf7R-LyAQWf2w  196823524     0    349 ms ago          349 ms ago             Observer
299     P4qXt3K0G5Qg_7w_UdvaNA  196823524     0    348 ms ago          348 ms ago             Observer
290     bA0pqZFsUa45lRTB6bS4bg  196823524     0    348 ms ago          348 ms ago             Observer
293     Av_12222lURKVYVt-aNKOQ  196823524     0    348 ms ago          348 ms ago             Observer
485     glENIgkIng1MYDF8HxxoDQ  196823524     0    349 ms ago          350 ms ago             Observer
```
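To make the mismatch explicit, here is a minimal Python sketch that compares the two listings above. The IDs and statuses are copied by hand from the outputs in this report. The helper name `stale_controllers` is hypothetical and not part of any Kafka tool. Treating only `Leader` and `Follower` rows as current controllers is an assumption about how to read the `describe` output.

```python
# Controller IDs listed under image/cluster/controllers/ in the metadata shell.
registered_controllers = {158, 206, 351, 584, 611, 686}

# NodeId -> Status, copied from the `describe --replication` output.
quorum_status = {
    158: "Leader",
    611: "Follower",
    206: "Follower",
    226: "Observer",
    181: "Observer",
    299: "Observer",
    290: "Observer",
    293: "Observer",
    485: "Observer",
}


def stale_controllers(registered, status_by_node):
    """Return registered controller IDs that are not current quorum voters.

    Assumption: Leader and Follower rows are the current controllers;
    Observer rows are brokers.
    """
    voters = {
        node_id
        for node_id, status in status_by_node.items()
        if status in ("Leader", "Follower")
    }
    return sorted(set(registered) - voters)


print(stale_controllers(registered_controllers, quorum_status))
# prints [351, 584, 686]
```

The result includes 351, the ID named in the error message, which suggests the feature update is validated against these leftover registrations rather than against the current voter set.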