Hi Igor, I think this KIP can solve the current problems, I have some problems relating to the migration section.
Since we have bumped broker RPC version and metadata record version, there will be some problems between brokers/controllers of different versions. In ZK mode we use IBP as a flag to help solve this, in KRaft mode we use a feature flag(metadata.version) as a flag for using new RPC/metadata or not. Assuming that we are upgrading from 3.3 to 3.4, firstly the finalized version of metadata.version is 3.3, brokers will use version 1 of `BrokerRegistrationRequest` which contains no `OfflineLogDirectories`, finally the finalized version of metadata.version is 3.4, but brokers will no longer send `BrokerRegistrationRequest` unless we restart the broker, so controllers can’t be aware of the `OfflineLogDirectories` of each broker, so we should reconsider the suggestion of Jason to use `BrokerHeartbeatRequest` to communicate `OfflineLogDirectories`. Of course this problem can be solved through a double roll(restart broker twice when upgrading), but we should trying to avoid it since we now have feature flag. One solution is that we include `OfflineLogDirectories` both in `BrokerRegistrationRequest` and `BrokerHeartbeatRequest` requests, when upgrading from 3.3 to 3.4 the `BrokerRegistrationRequest.OfflineLogDirectories` is empty whereas when upgrading from 3.4 to 3.5 it will not be empty. And maybe we can also remove `LogDirectoriesOfflineRequest` you proposed in this KIP. -- Best, Ziming > On Aug 18, 2022, at 11:24 PM, Igor Soarez <i...@soarez.me> wrote: > > Hi Ziming, > > I'm sorry it took me a while to reply. > > Thank you for having a look at this KIP and providing feedback. > >> 1. We have a version field in meta.properties, currently it’s 1, and we can >> set it to 2 in this KIP, and we can give an example of server.properties and >> it’s corresponding meta.properties generated by the storage command tool. > >> 2. When using storage format tool we should specify cluster-id, it will be an >> arduous work if we should manually specify directory-ids for all log dirs, >> I think you can make it more clear about this change that the directory-ids >> are generated automatically. > > Thank you for these suggestions. I've updated the KIP to: > > * include an example > * clarify that the version field would be changed to 2 > * clarify that the log directory UUIDs are automatically generated > >> 3. When controller place a replica, it will select a broker as well as a log >> directory, the latter is currently accomplished in the broker side, so this >> will be a big change? > > I think there can be benefits, as Jason described previously, if we change how > log directories are assigned as follow-up work. > > From a codebase surface area perspective, it is definitely a big change > because there are many models, types and interfaces that assume replicas are > identified solely by a broker id. > That will have to change to include the directory UUID, many lines of code > will > be affected. > > But in terms of behavior it shouldn't be a big change at all. Brokers > currently > select the log directory with the least logs in it. This isn't a very nice > policy, as logs can have wildly different sizes, and log directories can have > different capacities. But it is a policy that the controller can keep. > > If we decide to extend the selection policy and keep it in the broker, > the broker will continue to be able to override the controller's selection > of log directory, using the `AssignReplicasToDirectories` RPC. > >> 4. When handling log directory failures we will update Leader and ISR using >> the existing replica state machine, what is this state machine referring to, >> do you mean the AlterPartition RPC? > > No, I mean that it will use the same rules and mechanism > (`PartitionChangeBuilder`) that is used when a broker is fenced, shutdown or > unregistered. > > I think maybe the term "replica state machine" isn't the very appropriate. > I've updated the KIP to rephrase this section. > > Thanks, > > -- > Igor