Hi All,

During the code review it came up that we shouldn't count replication bytes
together with reassignment bytes so they count to a different metrics. This
is a change in the semantics of ReplicationBytesInPerSec and
ReplicationBytesOutPerSec metrics but since we plan to separate
reassignment from replication in terms of throttling, it makes sense to
record metrics separately as well.
If there are no objections we'll proceed with this interpretation but I
wanted to send a shoutout here as well.

Also the ReassignmentMaxLag will go in a different JIRA as it requires more
discussion.

Thanks,
Viktor

On Mon, Aug 26, 2019 at 6:30 PM Jason Gustafson <ja...@confluent.io> wrote:

> Closing this vote. The final result is +9 with 4 binding votes.
>
> @Satish Sorry, I missed your question above. Good point about updating
> documentation. I will create a separate jira to make sure this gets done.
>
> -Jason
>
> On Fri, Aug 23, 2019 at 11:23 AM Jason Gustafson <ja...@confluent.io>
> wrote:
>
> > Thanks Stan, good catch. I have updated the KIP. I will plan to close the
> > vote Monday if there are no objections.
> >
> > -Jason
> >
> > On Fri, Aug 23, 2019 at 11:14 AM Colin McCabe <cmcc...@apache.org>
> wrote:
> >
> >> On Fri, Aug 23, 2019, at 11:08, Stanislav Kozlovski wrote:
> >> > Thanks for the KIP, this is very helpful
> >> >
> >> > I had an offline discussion with Jason and we discussed the semantics
> of
> >> > the underMinIsr/atMinIsr metrics. The current proposal would expose a
> >> gap
> >> > where we could report URP but no MinIsr.
> >> > A brief example:
> >> > original replica set = [0,1,2]
> >> > new replica set = [3,4,5]
> >> > isr = [0, 3, 4]
> >> > config.minIsr = 3
> >> >
> >> > As the KIP said
> >> > > In other words, we will subtract the AddingReplica from both the
> total
> >> > replicas and the current ISR when determining URP satisfaction.
> >> > We would report URP=2 (1 and 2 are not in ISR) but not underMinIsr, as
> >> we
> >> > have an ISR of 3.
> >> > Technically, any produce requests with acks=all would succeed, so it
> >> would
> >> > be false to report `underMinIsr`. We thought it'd be good to keep both
> >> > metrics consistent, so a new proposal is to use the following
> algorithm:
> >> > ```
> >> > isUrp == size(original replicas) - size(isr) > 0
> >> > ```
> >>
> >> Hi Stan,
> >>
> >> That's a good point.  Basically we should regard the size of the
> original
> >> replica set as the desired replication factor, and calculate the URPs
> based
> >> on that.  +1 for this.  (I assume Jason will update the KIP...)
> >>
> >> best,
> >> Colin
> >>
> >>
> >> >
> >> > Taking that into account, +1 from me! (non-binding)
> >> >
> >> > On Fri, Aug 23, 2019 at 7:00 PM Colin McCabe <cmcc...@apache.org>
> >> wrote:
> >> >
> >> > > +1 (binding).
> >> > >
> >> > > cheers,
> >> > > Colin
> >> > >
> >> > > On Tue, Aug 20, 2019, at 10:55, Jason Gustafson wrote:
> >> > > > Hi All,
> >> > > >
> >> > > > I'd like to start a vote on KIP-352, which is a follow-up to
> >> KIP-455 to
> >> > > > fix
> >> > > > a long-known shortcoming of URP reporting and to improve
> >> reassignment
> >> > > > monitoring:
> >> > > >
> >> > >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-352%3A+Distinguish+URPs+caused+by+reassignment
> >> > > > .
> >> > > >
> >> > > > Note that I have added one new metric following the discussion. It
> >> seemed
> >> > > > useful to have a lag metric for reassigning partitions.
> >> > > >
> >> > > > Thanks,
> >> > > > Jason
> >> > > >
> >> > >
> >> >
> >> >
> >> > --
> >> > Best,
> >> > Stanislav
> >> >
> >>
> >
>

Reply via email to