Hi Colin,

Thanks for your questions.
Please have a look at my answers below.

> In the previous email I asked, "who is responsible for assigning replicas to 
> broker directories?" Can you clarify what the answer is to that? If the 
> answer is the controller, there is no need for an "unknown" state for 
> assignments, since the controller can simply choose an assignment immediately 
> when it creates a replica.

Apologies, I thought I had made this clear in my previous email
and in the KIP. The Broker is responsible for assigning replicas
to log directories.

> The broker has access to the same metadata, of course, and already knows what 
> directory a replica is supposed to be in. If that directory doesn't exist, 
> then the replica is offline. There is no need to modify replicas to have an 
> "unknown" assignment state.

You are correct: we can avoid this intermediate metadata update by
the Controller. The Controller doesn't have to reset assignments
to Uuid.ZERO. Instead, when there are no offline dirs, the Broker
can simply select a new directory for any replica assigned to a UUID
that matches none of its configured directories.
I have updated the KIP with this change.

However, I still think we need reserved UUIDs.

  a) If there are any offline log directories, the Broker cannot
  distinguish between replicas that are assigned to offline dirs
  and replicas that are assigned to deconfigured directories and
  need new placement.
  Since new replicas are created with Uuid.ZERO, it is safe for the
  Broker to select a directory for them, even if some dirs are offline.
  This replaces the use of the `isNew` flag in ZK mode.

  b) As described in the KIP, under "Handling log directory failures",
  in a race between syncing assignments and failing log directories,
  the Broker might exceptionally need to convey to the Controller
  that some replica that was in the failed directory had an incorrect
  assignment in the metadata, and the Broker can do that with an
  AssignReplicasToDirs using a reserved UUID value.

  c) During a Migration from a ZK JBOD cluster, the Controller
  can enforce fencing of a Broker until all replica assignments
  are known (i.e. not Uuid.ZERO).
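
To make case a) concrete, here is a rough sketch of the decision the
Broker would make for each replica. This is only an illustration, not
the KIP's implementation: the class, enum and method names are made up,
and java.util.UUID stands in for Kafka's Uuid type so that the snippet
is self-contained.

```java
import java.util.Set;
import java.util.UUID;

public class DirAssignmentSketch {
    // Stand-in for Uuid.ZERO (the "unknown directory" reserved value).
    static final UUID UNKNOWN_DIR = new UUID(0L, 0L);

    // Hypothetical outcomes, named for this sketch only.
    enum Action { KEEP, SELECT_NEW_DIR, TREAT_AS_OFFLINE }

    /**
     * @param assigned      directory UUID recorded for the replica in the metadata
     * @param onlineDirs    UUIDs of the log directories currently online on this broker
     * @param anyDirOffline true if at least one configured log directory is offline
     */
    static Action decide(UUID assigned, Set<UUID> onlineDirs, boolean anyDirOffline) {
        if (onlineDirs.contains(assigned)) {
            return Action.KEEP; // assignment matches an online directory
        }
        if (assigned.equals(UNKNOWN_DIR)) {
            // New replica: always safe to place, even if some dirs are offline.
            return Action.SELECT_NEW_DIR;
        }
        if (anyDirOffline) {
            // Cannot tell an offline dir from a deconfigured one, so do not move it.
            return Action.TREAT_AS_OFFLINE;
        }
        // No offline dirs: the UUID must belong to a deconfigured directory.
        return Action.SELECT_NEW_DIR;
    }
}
```

The point of the reserved value is the second branch: without it, a
newly created replica would be indistinguishable from one that lives
in an offline directory.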

I've updated the KIP to name our use of Uuid.ZERO as
Uuid.UnknownDir, and I've introduced Uuid.OfflineDir
to deal with case b), the log directory failure race.

Let me know what you think.


Best,

--
Igor

