[ https://issues.apache.org/jira/browse/KAFKA-16431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chia-Ping Tsai resolved KAFKA-16431.
------------------------------------
    Fix Version/s:     (was: 3.7.1)
       Resolution: Duplicate

> Handle log dir failure in hybrid mode
> -------------------------------------
>
>                 Key: KAFKA-16431
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16431
>             Project: Kafka
>          Issue Type: Bug
>          Components: jbod
>    Affects Versions: 3.7.0
>            Reporter: Igor Soarez
>            Assignee: Igor Soarez
>            Priority: Critical
>
> As part of the KRaft migration, the Controller implements some of the ZK-mode
> controller functionality that is employed during the migration in what is
> known as "hybrid mode".
> In hybrid mode some brokers may still be running in ZK-mode and some brokers
> may have already been restarted into KRaft mode.
> The ZK-mode Controller implementation in KRaft does not implement the
> ZK-based logic to handle directory failures, so it will be unable to re-elect
> leaders for partitions led by failed directories.
> This leaves a gap for JBOD during the ZK-KRaft migration. And there are two
> main ways this can be addressed:
> # Implement the ZK-mode functionality to handle failed directories. Like in
> ZK-mode, the controller needs to subscribe to events in the
> `/log_dir_event_notification` ZNode, and rely on per-partition errors on full
> LeaderAndIsr responses to detect directory failures.
> # Another, simpler way to address this, would be to have a migrating ZK
> broker stop upon any directory failure. This would sacrifice some
> availability / operational flexibility, but it may be much more
> straightforward to implement in comparison.
> Without a solution, a directory failure during the migration may lead to
> indefinite partition unavailability.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
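
For illustration only, the second option from the issue description (a migrating ZK broker stops on any directory failure) could be sketched roughly as below. This is not Kafka code: the class `HybridModeLogDirFailurePolicy`, the `Action` enum, and the `zkBrokerInMigration` flag are all hypothetical names made up for this sketch, and it only models the decision itself, not how a real broker would detect the failure or shut down.

```java
/**
 * Hypothetical sketch of option 2 from the issue description.
 * None of these names exist in Apache Kafka; they are assumptions
 * introduced purely for illustration.
 */
final class HybridModeLogDirFailurePolicy {

    /** What a broker does after one of its log directories fails. */
    enum Action {
        /** Keep serving from the remaining directories (usual JBOD behaviour). */
        MARK_DIR_OFFLINE_AND_CONTINUE,
        /** Stop the whole broker instead of running with a failed directory. */
        SHUT_DOWN_BROKER
    }

    /**
     * @param zkBrokerInMigration assumed flag: true if this broker still runs in
     *                            ZK mode while the cluster is in hybrid mode
     */
    static Action onLogDirFailure(boolean zkBrokerInMigration) {
        // Per the issue, the KRaft controller in hybrid mode cannot re-elect
        // leaders for partitions on a failed directory, so option 2 gives up
        // the remaining directories and stops the broker entirely.
        return zkBrokerInMigration
                ? Action.SHUT_DOWN_BROKER
                : Action.MARK_DIR_OFFLINE_AND_CONTINUE;
    }

    public static void main(String[] args) {
        System.out.println(onLogDirFailure(true));
        System.out.println(onLogDirFailure(false));
    }
}
```

The trade-off named in the description shows up directly in the sketch: a single failed directory takes the entire broker offline during the migration, which costs availability but avoids implementing the `/log_dir_event_notification` handling in the KRaft controller.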