Hi Ismael,
By Jason's suggestion we finally went with the originally voted proposal
that is to include reassignment bytes in/out in replication bytes in/out
and we discuss this when throttling is changed. Sorry for not updating this
thread, I was busy with the code review. I'll get the KIP in shape to
reflect the final version.

Thanks,
Viktor

On Wed, Oct 16, 2019 at 10:06 PM Ismael Juma <isma...@gmail.com> wrote:

> The current proposal says that replication throughput would change not to
> include reassignment though.
>
> Ismael
>
> On Wed, Oct 16, 2019, 11:53 AM Colin McCabe <cmcc...@apache.org> wrote:
>
> > Hi Ismael,
> >
> > I think every replica is doing replication, by definition.  But not every
> > replica is undergoing reassignment.
> >
> > If the broker that died was in the set of new replicas being added, its
> > death will not add a new under-replicated partition.  Otherwise, it will
> > add a new URP.
> >
> > best,
> > Colin
> >
> >
> > On Wed, Oct 16, 2019, at 11:26, Ismael Juma wrote:
> > > If a broker dies and loses the disk, is it replication or reassignment?
> > >
> > > Ismael
> > >
> > > On Thu, Oct 10, 2019 at 3:30 AM Viktor Somogyi-Vass <
> > viktorsomo...@gmail.com>
> > > wrote:
> > >
> > > > Hi All,
> > > >
> > > > During the code review it came up that we shouldn't count replication
> > bytes
> > > > together with reassignment bytes so they count to a different
> metrics.
> > This
> > > > is a change in the semantics of ReplicationBytesInPerSec and
> > > > ReplicationBytesOutPerSec metrics but since we plan to separate
> > > > reassignment from replication in terms of throttling, it makes sense
> to
> > > > record metrics separately as well.
> > > > If there are no objections we'll proceed with this interpretation
> but I
> > > > wanted to send a shoutout here as well.
> > > >
> > > > Also the ReassignmentMaxLag will go in a different JIRA as it
> requires
> > more
> > > > discussion.
> > > >
> > > > Thanks,
> > > > Viktor
> > > >
> > > > On Mon, Aug 26, 2019 at 6:30 PM Jason Gustafson <ja...@confluent.io>
> > > > wrote:
> > > >
> > > > > Closing this vote. The final result is +9 with 4 binding votes.
> > > > >
> > > > > @Satish Sorry, I missed your question above. Good point about
> > updating
> > > > > documentation. I will create a separate jira to make sure this gets
> > done.
> > > > >
> > > > > -Jason
> > > > >
> > > > > On Fri, Aug 23, 2019 at 11:23 AM Jason Gustafson <
> ja...@confluent.io
> > >
> > > > > wrote:
> > > > >
> > > > > > Thanks Stan, good catch. I have updated the KIP. I will plan to
> > close
> > > > the
> > > > > > vote Monday if there are no objections.
> > > > > >
> > > > > > -Jason
> > > > > >
> > > > > > On Fri, Aug 23, 2019 at 11:14 AM Colin McCabe <
> cmcc...@apache.org>
> > > > > wrote:
> > > > > >
> > > > > >> On Fri, Aug 23, 2019, at 11:08, Stanislav Kozlovski wrote:
> > > > > >> > Thanks for the KIP, this is very helpful
> > > > > >> >
> > > > > >> > I had an offline discussion with Jason and we discussed the
> > > > semantics
> > > > > of
> > > > > >> > the underMinIsr/atMinIsr metrics. The current proposal would
> > expose
> > > > a
> > > > > >> gap
> > > > > >> > where we could report URP but no MinIsr.
> > > > > >> > A brief example:
> > > > > >> > original replica set = [0,1,2]
> > > > > >> > new replica set = [3,4,5]
> > > > > >> > isr = [0, 3, 4]
> > > > > >> > config.minIsr = 3
> > > > > >> >
> > > > > >> > As the KIP said
> > > > > >> > > In other words, we will subtract the AddingReplica from both
> > the
> > > > > total
> > > > > >> > replicas and the current ISR when determining URP
> satisfaction.
> > > > > >> > We would report URP=2 (1 and 2 are not in ISR) but not
> > underMinIsr,
> > > > as
> > > > > >> we
> > > > > >> > have an ISR of 3.
> > > > > >> > Technically, any produce requests with acks=all would succeed,
> > so it
> > > > > >> would
> > > > > >> > be false to report `underMinIsr`. We thought it'd be good to
> > keep
> > > > both
> > > > > >> > metrics consistent, so a new proposal is to use the following
> > > > > algorithm:
> > > > > >> > ```
> > > > > >> > isUrp == size(original replicas) - size(isr) > 0
> > > > > >> > ```
> > > > > >>
> > > > > >> Hi Stan,
> > > > > >>
> > > > > >> That's a good point.  Basically we should regard the size of the
> > > > > original
> > > > > >> replica set as the desired replication factor, and calculate the
> > URPs
> > > > > based
> > > > > >> on that.  +1 for this.  (I assume Jason will update the KIP...)
> > > > > >>
> > > > > >> best,
> > > > > >> Colin
> > > > > >>
> > > > > >>
> > > > > >> >
> > > > > >> > Taking that into account, +1 from me! (non-binding)
> > > > > >> >
> > > > > >> > On Fri, Aug 23, 2019 at 7:00 PM Colin McCabe <
> > cmcc...@apache.org>
> > > > > >> wrote:
> > > > > >> >
> > > > > >> > > +1 (binding).
> > > > > >> > >
> > > > > >> > > cheers,
> > > > > >> > > Colin
> > > > > >> > >
> > > > > >> > > On Tue, Aug 20, 2019, at 10:55, Jason Gustafson wrote:
> > > > > >> > > > Hi All,
> > > > > >> > > >
> > > > > >> > > > I'd like to start a vote on KIP-352, which is a follow-up
> to
> > > > > >> KIP-455 to
> > > > > >> > > > fix
> > > > > >> > > > a long-known shortcoming of URP reporting and to improve
> > > > > >> reassignment
> > > > > >> > > > monitoring:
> > > > > >> > > >
> > > > > >> > >
> > > > > >>
> > > > >
> > > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-352%3A+Distinguish+URPs+caused+by+reassignment
> > > > > >> > > > .
> > > > > >> > > >
> > > > > >> > > > Note that I have added one new metric following the
> > discussion.
> > > > It
> > > > > >> seemed
> > > > > >> > > > useful to have a lag metric for reassigning partitions.
> > > > > >> > > >
> > > > > >> > > > Thanks,
> > > > > >> > > > Jason
> > > > > >> > > >
> > > > > >> > >
> > > > > >> >
> > > > > >> >
> > > > > >> > --
> > > > > >> > Best,
> > > > > >> > Stanislav
> > > > > >> >
> > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to