Hi Colin,

Thanks for your support with getting this over the line and that’s
great re the preliminary pass! Thanks also for sharing your
thoughts, I've had a careful look at each of these and sharing my
comments below.

I agree, it is important to avoid a perf hit on non-JBOD.
I've opted for tagged fields in:

 - Assignment.Directory in PartitionRecord and PartitionChangeRecord
 - OnlineLogDirs in RegisterBrokerRecord,
   BrokerRegistrationChangeRecord and BrokerRegistrationRequest
 - OfflineLogDirs in BrokerHeartbeatRequest

I don't think we should use UUID.Zero to refer to the first log
directory because this value also indicates "unknown, or no log dir
yet assigned". We can work with the default value by gating on JBOD
configuration — determined by log.dirs (Broker side) and
BrokerRegistration (Controller side). In non-JBOD:
 - The single logdir won't have a UUID
 - BrokerRegistration doesn't list any log dir UUIDs
 - AssignReplicasToDirs is never used

Directory reassignment will work the same way as in ZK mode, but
with the difference that the promotion of the future replica
requires an AssignReplicasToDirs request to update the assignment.
I've tried to improve the description of this operation and
included a diagram to illustrate it.

I've renamed LogDirsOfflined to OfflineLogDirs in the
BrokerHeartbeatRequest. This field was named differently because
it's only used for log directories that have become offline but are
not yet represented as offline in the metadata, from the Broker's
point of view — as opposed to always listing the full set of offline
log dirs.

I don't think we should identify log directories using system
paths, because those may be arbitrary. A set of storage devices may
be mounted and re-mounted on the same set of system paths using a
different order every time. Kafka only cares about the content, not
the location of the log directories.

I think I have overcomplicated this by trying to identify offline
log directories. In ZK mode we don't care to do this, and we
shouldn't do it in KRaft either. What we need to know is if there
are any offline log directories, to prevent re-streaming the offline
replicas into the remaining online log dirs. In ZK mode, the 'isNew'
flag is used to prevent the Broker from creating partitions when any
logdir is offline unless they're new. In KRaft the Controller can
reset the assignment to UUID.Zero for replicas in log dirs not
listed in the broker registration only when the broker registration
indicates no offline log dirs. So I've updated the KIP to:
 - Remove directory.ids from meta.properties
 - Change OfflineLogDirs in RegisterBrokerRecord,
   BrokerRegistrationChangeRecord and BrokerRegistrationRequest
   to a boolean
 - Describe this behavior in the Controller in Broker

It’s been a lot of work to get here and I’m similarly excited to get
this released soon! The vote has been open over the last week
and I'm happy to give it another two to get any other thoughts without
rushing. Thanks again for the support and input!

Best,

--
Igor


Reply via email to