I remember from a previous discussion that Codahale metrics are shaded inside storm-core, which breaks compatibility with any existing plugins/reporters. Won't that be a problem here? And by the way, does it need to be shaded?
@Jungtaek - Exactly what are the core issues you have run into w.r.t. metrics? At my company, we make heavy use of metrics, and we faced these major issues:

- Explosion of metrics as the number of tasks increases. This put a lot of unnecessary load on the Graphite servers even though we were only interested in machine-level aggregated metrics. Aggregation is difficult to solve while keeping backward compatibility intact.
- The metric tick shares the same queue as the message queue. If a bolt is slow or blocked, metrics for that bolt will not be emitted, since the metric tick won't be consumed by the bolt. This can cause a lot of confusion. [Refer STORM-972]
- Only averages are emitted for latency in many places, while histograms are more useful.

I know you are trying to solve many problems with metric collection, but solving these problems independently of each other might not be the best approach. I would vote for implementing a backward-incompatible solution if it solves all these problems in a clean way.

On Wed, May 18, 2016 at 9:55 PM, P. Taylor Goetz <[email protected]> wrote:

> +1 for standardizing on dropwizard / Coda Hale's metrics library. It's a
> solid library that's widely used and understood.
>
> -Taylor
>
> > On May 18, 2016, at 10:22 AM, Bobby Evans <[email protected]> wrote:
> >
> > There are a lot of things that I dislike about IMetric. It provides too
> > much flexibility and at the same time not enough information/conventions
> > to be able to interpret the numbers it returns correctly. We recently
> > had a case where someone was trying to compute an average using a
> > ReducedMetric and a MeanReducer (which, by the way, should be deprecated
> > because it is fundamentally flawed). This hands the metric collector an
> > average. How is it supposed to combine one average with another when
> > doing a roll-up, either across components or across time ranges? It just
> > does not work mathematically unless you know that all of the averages
> > had the exact same number of operations in them, which we cannot know.
> >
> > This is why dropwizard and other metrics systems have a specific set of
> > supported metrics, not Object, that they know mathematically work out. A
> > gauge is different from a counter, which is different from a ratio, or a
> > meter, or a timer, or a histogram. Please let's not reinvent the wheel
> > here; we already did it wrong once, let's not do it wrong again. We are
> > using dropwizard in other places in the code internally. I would prefer
> > that we standardize on it, or on a thin wrapper around it based on the
> > same concepts. Or if there is a different API that someone here would
> > prefer that we use, that is fine with me too. But let's not write it
> > ourselves; let's take from the experts who have spent a long time
> > building something that works.
> >
> > - Bobby
> >
> > On Tuesday, May 17, 2016 10:10 PM, Jungtaek Lim <[email protected]> wrote:
> >
> > Hi devs,
> >
> > Since IMetric#getValueAndReset doesn't restrict the return type, it
> > gives us flexibility, but the metrics consumer has to parse the value
> > without context (relying on just some assumptions).
> >
> > I've looked into some open source metrics consumers, and many of them
> > support Number and Map<String, Number/String>, and one of them supports
> > nested Maps. For the case of a Map, its keys are appended to the metric
> > key and its values are converted to 'double'. I think that would be
> > enough, but I'm not sure we can rely on all metrics consumers to handle
> > it properly.
> >
> > I feel it would be great if we could recommend proper types of DataPoint
> > values for storing metrics to a time-series DB via a metrics consumer.
> > It could serve as a protocol between IMetric users and metrics consumer
> > developers.
> >
> > What do you think?
> >
> > Thanks,
> > Jungtaek Lim (HeartSaVioR)
> >
> > ps. I'm not a heavy user of time-series DBs (I researched some, but they
> > don't document the type/size of values clearly), so if someone could
> > share which value types/sizes various time-series DBs support, that
> > would be great. Or we can just assume numbers are 'double' as above and
> > go forward.

--
Regards,
Abhishek Agarwal
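Bobby's point about averages can be made concrete with a small sketch. The latency and count numbers below are made up for illustration; the point is that once a metric hands the collector a pre-computed average (as ReducedMetric + MeanReducer does), the operation counts are gone and any roll-up of those averages is wrong:

```java
// Hypothetical numbers illustrating why averaging averages fails.
public class AverageRollup {
    public static void main(String[] args) {
        // Task A: 2 operations with latencies 10ms and 20ms -> avg 15ms
        double avgA = (10 + 20) / 2.0;
        // Task B: 1,000 operations, each taking 100ms -> avg 100ms
        double avgB = 100.0;

        // Naive roll-up: average the two averages.
        double naive = (avgA + avgB) / 2.0;                      // 57.5 ms

        // A correct roll-up needs the counts, which the average discards.
        double correct = (10 + 20 + 1000 * 100.0) / (2 + 1000);  // ~99.8 ms

        System.out.printf("naive=%.1f correct=%.1f%n", naive, correct);
    }
}
```

This is exactly why typed metrics compose across components and time ranges while bare averages do not: a dropwizard Histogram, for instance, keeps the sample count alongside the values.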

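Jungtaek's description of how consumers flatten a Map-valued DataPoint (key appended to the metric key, value coerced to double) could look roughly like the sketch below. The class, method, and metric names are hypothetical, not actual Storm consumer code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the Map-flattening convention described above.
public class DataPointFlattener {
    // Appends each Map key to the metric name and coerces leaf values to
    // double; non-numeric, non-Map values are silently dropped here, which
    // is one of the ambiguities a real convention would need to settle.
    static Map<String, Double> flatten(String prefix, Object value) {
        Map<String, Double> out = new LinkedHashMap<>();
        if (value instanceof Number) {
            out.put(prefix, ((Number) value).doubleValue());
        } else if (value instanceof Map) {
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                out.putAll(flatten(prefix + "." + e.getKey(), e.getValue()));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> latency = new LinkedHashMap<>();
        latency.put("mean", 12.5);
        Map<String, Object> dataPoint = new LinkedHashMap<>();
        dataPoint.put("emitted", 42L);      // a Number leaf
        dataPoint.put("latency", latency);  // a nested Map
        // → {bolt1.emitted=42.0, bolt1.latency.mean=12.5}
        System.out.println(flatten("bolt1", dataPoint));
    }
}
```

A recommendation along the lines of "Number, Map<String, Number>, or one level of nesting, all coerced to double" would give metrics consumer authors something concrete to code against.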