Hi Jason, I too agree this is more of a problem in older versions and therefore we could backport it. Were you thinking of any specific versions? I guess the 2.x and 1.x versions are definitely targets here but I was thinking that we might not want to further.
Viktor On Mon, Apr 29, 2019 at 12:55 AM Stanislav Kozlovski <[email protected]> wrote: > Thanks for the work done, Viktor! +1 (non-binding) > > I strongly agree with Jason that this monitoring-focused KIP is worth > porting back to older versions. I am sure users will find it very useful > > Best, > Stanislav > > On Fri, Apr 26, 2019 at 9:38 PM Jason Gustafson <[email protected]> > wrote: > > > Thanks, that works for me. +1 > > > > By the way, we don't normally port KIPs to older releases, but I wonder > if > > it's worth making an exception here. From recent experience, it tends to > be > > the older versions that are more prone to fetcher failures. Thoughts? > > > > -Jason > > > > On Fri, Apr 26, 2019 at 5:18 AM Viktor Somogyi-Vass < > > [email protected]> > > wrote: > > > > > Let me have a second thought, I'll just add the clientId instead to > > follow > > > the convention, so it'll change DeadFetcherThreadCount but with the > > > clientId tag. > > > > > > On Fri, Apr 26, 2019 at 11:29 AM Viktor Somogyi-Vass < > > > [email protected]> wrote: > > > > > > > Hi Jason, > > > > > > > > Yea I think it could make sense. In this case I would rename the > > > > DeadFetcherThreadCount to DeadReplicaFetcherThreadCount and introduce > > the > > > > metric you're referring to as DeadLogDirFetcherThreadCount. > > > > I'll update the KIP to reflect this. > > > > > > > > Viktor > > > > > > > > On Thu, Apr 25, 2019 at 8:07 PM Jason Gustafson <[email protected]> > > > > wrote: > > > > > > > >> Hi Viktor, > > > >> > > > >> This looks good. Just one question I had is whether we may as well > > cover > > > >> the log dir fetchers as well. > > > >> > > > >> Thanks, > > > >> Jason > > > >> > > > >> > > > >> On Thu, Apr 25, 2019 at 7:46 AM Viktor Somogyi-Vass < > > > >> [email protected]> > > > >> wrote: > > > >> > > > >> > Hi Folks, > > > >> > > > > >> > This thread sunk a bit but I'd like to bump it hoping to get some > > > >> feedback > > > >> > and/or votes. > > > >> > > > > >> > Thanks, > > > >> > Viktor > > > >> > > > > >> > On Thu, Mar 28, 2019 at 8:47 PM Viktor Somogyi-Vass < > > > >> > [email protected]> > > > >> > wrote: > > > >> > > > > >> > > Sorry, the end of the message cut off. > > > >> > > > > > >> > > So I tried to be consistent with the convention in LogManager, > > hence > > > >> the > > > >> > > hyphens and in AbstractFetcherManager, hence the camel case. It > > > would > > > >> be > > > >> > > nice though to decide with one convention across the whole > > project, > > > >> > however > > > >> > > it requires a major refactor (especially for the components that > > > >> leverage > > > >> > > metrics for monitoring). > > > >> > > > > > >> > > Thanks, > > > >> > > Viktor > > > >> > > > > > >> > > On Thu, Mar 28, 2019 at 8:44 PM Viktor Somogyi-Vass < > > > >> > > [email protected]> wrote: > > > >> > > > > > >> > >> Hi Dhruvil, > > > >> > >> > > > >> > >> Thanks for the feedback and the vote. I fixed the typo in the > > KIP. > > > >> > >> The naming is interesting though. Unfortunately kafka overall > is > > > not > > > >> > >> consistent in metric naming but at least I tried to be > consistent > > > >> among > > > >> > the > > > >> > >> other metrics used in LogManager > > > >> > >> > > > >> > >> On Thu, Mar 28, 2019 at 7:32 PM Dhruvil Shah < > > [email protected] > > > > > > > >> > >> wrote: > > > >> > >> > > > >> > >>> Thanks for the KIP, Viktor! This is a useful addition. +1 > > overall. > > > >> > >>> > > > >> > >>> Minor nits: > > > >> > >>> > I propose to add three gauge: DeadFetcherThreadCount for the > > > >> fetcher > > > >> > >>> threads, log-cleaner-dead-thread-count for the log cleaner. > > > >> > >>> I think you meant two instead of three. > > > >> > >>> > > > >> > >>> Also, would it make sense to name these metrics consistency, > > > >> something > > > >> > >>> like > > > >> > >>> `log-cleaner-dead-thread-count` and > > > >> > `replica-fetcher-dead-thread-count`? > > > >> > >>> > > > >> > >>> Thanks, > > > >> > >>> Dhruvil > > > >> > >>> > > > >> > >>> On Thu, Mar 28, 2019 at 11:27 AM Viktor Somogyi-Vass < > > > >> > >>> [email protected]> wrote: > > > >> > >>> > > > >> > >>> > Hi All, > > > >> > >>> > > > > >> > >>> > I'd like to start a vote on KIP-434. > > > >> > >>> > This basically would add a metrics to count dead threads in > > > >> > >>> > ReplicaFetcherManager and LogCleaner to allow monitoring > > systems > > > >> to > > > >> > >>> alert > > > >> > >>> > based on this. > > > >> > >>> > > > > >> > >>> > The KIP link: > > > >> > >>> > > > > >> > >>> > > > > >> > >>> > > > >> > > > > >> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-434%3A+Add+Replica+Fetcher+and+Log+Cleaner+Count+Metrics > > > >> > >>> > The > > > >> > >>> > PR: https://github.com/apache/kafka/pull/6514 > > > >> > >>> > > > > >> > >>> > I'd be happy to receive any votes or additional > > feedback/reviews > > > >> too. > > > >> > >>> > > > > >> > >>> > Thanks, > > > >> > >>> > Viktor > > > >> > >>> > > > > >> > >>> > > > >> > >> > > > >> > > > > >> > > > > > > > > > > > > -- > Best, > Stanislav >
