On Thu, Jun 8, 2023, at 10:00, David Arthur wrote: > Colin, thanks for the KIP! These all seem like pretty useful additions. A > few quick questions > > 1) Will the value of TimedOutBrokerHeartbeatCount be zero for inactive > controllers?
No. It will just stay at whatever it was when the controller became inactive (or 0 if it was never active). > Will the value reset to zero after an election, or only > process restart? Only process restart. The reason is that if the metric resets on a new controller being elected, a lot of metrics collection systems will potentially miss it. After all, one pattern we've seen is excessive load leading to excessive controller elections. > 2) Does HandleLoadSnapshotCount include the initial load? It does count the initial load. > I.e., will it always be at least 1 Well, there isn't an initial load, if the cluster is brand new. :) But yes, in general a value of 1 is expected. best, Colin > > Thanks! > David > > On Wed, Jun 7, 2023 at 11:17 PM Luke Chen <show...@gmail.com> wrote: > >> Hi Colin, >> >> Thanks for the response. >> I have no more comments. >> +1 (binding) >> >> Luke >> >> On Thu, Jun 8, 2023 at 6:02 AM Colin McCabe <cmcc...@apache.org> wrote: >> > >> > Hi Luke, >> > >> > Thanks for the review and the suggestion. >> > >> > I think we will add more "handling time" metrics later, but for now I >> don't want to make this KIP any bigger than it is already... >> > >> > best, >> > Colin >> > >> > >> > On Wed, Jun 7, 2023, at 03:12, Luke Chen wrote: >> > > Hi Colin, >> > > >> > > One comment: >> > > Should we add a metric to record the snapshot handling time? >> > > Since we know the snapshot loading might take long if the size is huge. >> > > We might want to know how much time it is processed. WDYT? >> > > >> > > No matter you think we need it or not, the KIP LGTM. >> > > +1 from me. >> > > >> > > >> > > Thank you. >> > > Luke >> > > >> > > On Wed, Jun 7, 2023 at 1:33 PM Colin McCabe <cmcc...@apache.org> >> wrote: >> > >> >> > >> Hi all, >> > >> >> > >> I added two new metrics to the list: >> > >> >> > >> * LatestSnapshotGeneratedBytes >> > >> * LatestSnapshotGeneratedAgeMs >> > >> >> > >> These will help monitor the period snapshot generation process. >> > >> >> > >> best, >> > >> Colin >> > >> >> > >> >> > >> On Tue, Jun 6, 2023, at 22:21, Colin McCabe wrote: >> > >> > Hi Divij, >> > >> > >> > >> > Yes, I am referring to the feature level. I changed the description >> of >> > >> > CurrentMetadataVersion to reference the feature level specifically. >> > >> > >> > >> > best, >> > >> > Colin >> > >> > >> > >> > >> > >> > On Tue, Jun 6, 2023, at 05:56, Divij Vaidya wrote: >> > >> >> "Each metadata version has a corresponding integer in the >> > >> >> MetadataVersion.java file." >> > >> >> >> > >> >> Please correct me if I'm wrong, but are you referring to >> "featureLevel" >> > >> >> in >> > >> >> the enum at >> > >> >> >> https://github.com/apache/kafka/blob/trunk/server-common/src/main/java/org/apache/kafka/server/common/MetadataVersion.java#L45 >> > >> >> ? Is yes, can we please update the description of the metric to >> make it >> > >> >> easier for the users to understand this? For example, we can say, >> > >> >> "Represents the current metadata version as an integer value. See >> > >> >> MetadataVersion (hyperlink) for a mapping between string and >> integer >> > >> >> formats of metadata version". >> > >> >> >> > >> >> -- >> > >> >> Divij Vaidya >> > >> >> >> > >> >> >> > >> >> >> > >> >> On Tue, Jun 6, 2023 at 1:51 PM Ron Dagostino <rndg...@gmail.com> >> wrote: >> > >> >> >> > >> >>> Thanks again for the KIP, Colin. +1 (binding). >> > >> >>> >> > >> >>> Ron >> > >> >>> >> > >> >>> > On Jun 6, 2023, at 7:02 AM, Igor Soarez >> <soa...@apple.com.invalid> >> > >> >>> wrote: >> > >> >>> > >> > >> >>> > Thanks for the KIP. >> > >> >>> > >> > >> >>> > Seems straightforward, LGTM. >> > >> >>> > Non binding +1. >> > >> >>> > >> > >> >>> > -- >> > >> >>> > Igor >> > >> >>> > >> > >> >>> >>