Hi, Jose,

Thanks for the explanation. Other than depending on KIP-1022 to be
approved, the KIP looks good to me now.

Jun

On Thu, Mar 28, 2024 at 2:56 PM José Armando García Sancio
<jsan...@confluent.io.invalid> wrote:

> Hi Jun,
>
> See my comments below.
>
> On Thu, Mar 28, 2024 at 11:09 AM Jun Rao <j...@confluent.io.invalid> wrote:
> > If I am adding a new voter and it takes a long time (because the new
> > voter is catching up), I'd want to know if the request is indeed being
> > processed. I thought that's the usage of uncommitted-voter-change.
>
> They can get related information by using the "kafka-metadata-quorum
> describe --replication" command (or the log-end-offset metric from KIP-595).
> That command (and metric) displays the LEO of all of the replicas
> (voters and observers), according to the leader. They can use that
> output to discover if the observer they are trying to add is lagging
> or is not replicating at all.
>
> When the user runs the command above, they don't know the exact offset
> that the new controller needs to reach, but they can make a rough
> estimate of how far behind it is. What do you think? Is this good
> enough?
>
> > Also, I am still not sure about having multiple brokers reporting the
> > same metric. For example, if they don't report the same value (e.g.
> > because one broker is catching up), how does a user know which value is
> > correct?
>
> They are all correct according to the local view. Here are two
> examples of monitors that the user can write:
>
> 1. Is there a voter that I need to remove from the quorum? They can
> create a monitor that fires if the number-of-offline-voters metric
> has been greater than 0 for the past hour.
> 2. Is there a cluster that doesn't have 3 voters? They can create a
> monitor that fires if any replica has reported a number-of-voters
> value other than 3 for the past hour.
>
> Is there a specific metric that you have in mind that should only be
> reported by the KRaft leader?
>
> Thanks,
> --
> -José
>
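
For illustration, the two monitors described in the quoted message could be
sketched roughly as below. This is only a sketch under assumptions, not
anything specified by the KIP: the sample layout (timestamp, value), the
function names, and the replica ids are hypothetical. Only the metric names
number-of-offline-voters and number-of-voters and the one-hour window come
from the discussion.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical sample layout: (unix timestamp in seconds, metric value).
Sample = Tuple[float, int]


def sustained(samples: List[Sample], now: float, window_s: float,
              predicate: Callable[[int], bool]) -> bool:
    """True if there is at least one sample in the window and every sample
    in the window satisfies the predicate."""
    recent = [value for ts, value in samples if now - window_s <= ts <= now]
    return bool(recent) and all(predicate(value) for value in recent)


def offline_voter_alert(per_replica: Dict[str, List[Sample]], now: float,
                        window_s: float = 3600.0) -> bool:
    """Monitor 1: some replica has reported number-of-offline-voters > 0
    for the whole window."""
    return any(sustained(series, now, window_s, lambda v: v > 0)
               for series in per_replica.values())


def voter_count_alert(per_replica: Dict[str, List[Sample]], now: float,
                      expected: int = 3, window_s: float = 3600.0) -> bool:
    """Monitor 2: some replica has reported a number-of-voters value other
    than the expected one for the whole window."""
    return any(sustained(series, now, window_s, lambda v: v != expected)
               for series in per_replica.values())
```

Because each replica reports its own local view, both checks are evaluated
per replica and fire if any single replica meets the condition for the full
window. A replica that is briefly catching up would not trip them.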
