Yes, it is shaded, but it doesn't have to be.  It was shaded because it is not 
user facing, so adding it to the classpath could potentially cause some 
issues.  If we standardize on it, then we would want to stop shading it, or we 
could put a small shim layer in front of it so we could swap it out later if 
we wanted to.  Either would be fine.  But if we change the classpath we would 
need to bump a version number of some kind. If this is for 2.0 then it is 
already bumped, so no worries.  If this is for 1.x then we might want to go to 
1.1; I'm not sure what the policy is for adding something new to the classpath.
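A thin shim along the lines described above might look like the following. This is a hypothetical sketch; none of these names (`StormCounter`, `SimpleCounter`) exist in Storm, and a real shim would cover the other metric types as well:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical shim interface in front of a metrics library, so the backing
// implementation (e.g. Dropwizard) could be swapped out later without
// changing caller code. Illustrative names only.
interface StormCounter {
    void inc(long amount);
    long getCount();
}

// One possible backing implementation; a Dropwizard-backed version could
// implement the same interface and be substituted transparently.
class SimpleCounter implements StormCounter {
    private final AtomicLong count = new AtomicLong();
    public void inc(long amount) { count.addAndGet(amount); }
    public long getCount() { return count.get(); }
}
```

Callers would depend only on the shim interface, which is what makes the later swap possible.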
- Bobby

    On Wednesday, May 18, 2016 12:56 PM, Abhishek Agarwal 
<[email protected]> wrote:

 I remember from a previous discussion that codahale metrics are shaded inside 
storm-core and that breaks compatibility with any existing plugins/reporters. 
Will it not be a problem here? And btw, does it need to be shaded?
@Jungtaek - Exactly what are the core issues you have run into w.r.t. metrics? 
At my company, we make heavy use of metrics, and there were a few major issues 
we faced:
   - Explosion of metrics as the number of tasks increases - This puts a lot 
of unnecessary load on the Graphite servers even though we were only interested 
in machine-level aggregated metrics. Aggregation is difficult to solve while 
keeping backward compatibility intact.
   - Metric tick in the same queue as the message queue - If a bolt is slow or 
blocked, metrics for that bolt will not be emitted, since the metric tick won't 
be consumed by the bolt. It can cause a lot of confusion. [Refer STORM-972]
   - Only averages are emitted for latency in many places, while histograms 
would be more useful.
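The histogram point can be made concrete: two latency distributions can share the same average while having very different tails, which is exactly what a histogram's percentiles expose and an average hides. A minimal sketch in plain Java (no metrics library; `LatencyTails` is an illustrative name, not Storm code):

```java
import java.util.Arrays;

// Shows why averages hide tail latency: compute the mean and a
// nearest-rank percentile over a sample of latencies (ms).
class LatencyTails {
    static double mean(long[] xs) {
        return Arrays.stream(xs).average().orElse(0);
    }

    // Nearest-rank percentile (p in 0..100) on a sorted copy of the samples.
    static long percentile(long[] xs, double p) {
        long[] s = xs.clone();
        Arrays.sort(s);
        int idx = (int) Math.ceil(p / 100.0 * s.length) - 1;
        return s[Math.max(idx, 0)];
    }
}
```

For example, {10, 10, 10, 10, 10} and {2, 2, 2, 2, 42} both average 10 ms, but their p99 latencies are 10 ms and 42 ms respectively.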
I know you are trying to solve many problems with metric collection, but 
solving these problems independently of each other might not be the best 
approach. I would vote for implementing a backward-incompatible solution if it 
solves all these problems in a clean way.
On Wed, May 18, 2016 at 9:55 PM, P. Taylor Goetz <[email protected]> wrote:

+1 for standardizing on Dropwizard/Coda Hale’s metrics library. It’s a solid 
library that’s widely used and understood.

-Taylor


> On May 18, 2016, at 10:22 AM, Bobby Evans <[email protected]> wrote:
>
> There are a lot of things that I dislike about IMetric.  It provides too much 
> flexibility and at the same time not enough information/conventions to be 
> able to interpret the numbers it returns correctly.  We recently had a case 
> where someone was trying to compute an average using a ReducedMetric and a 
> MeanReducer (which by the way should be deprecated because it is 
> fundamentally flawed).  This hands the metric collector an average.  How is 
> it supposed to combine one average with another when doing a roll up, either 
> across components or across time ranges?  It just does not work 
> mathematically unless you know that all of the averages had the exact same 
> number of operations in them, which we cannot know.
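The point above about rolling up averages can be made concrete: the average of two averages only equals the true combined average when both sides saw the same number of operations. A minimal sketch in plain Java (`RollUp` is an illustrative name, not Storm code):

```java
// Combining two per-window latency averages: naively vs. correctly.
class RollUp {
    // Naive roll-up: average the averages. Wrong unless both windows
    // contained the same number of operations.
    static double naive(double avgA, double avgB) {
        return (avgA + avgB) / 2.0;
    }

    // Correct roll-up needs the operation counts, which a bare average
    // (as handed over by a MeanReducer) has already thrown away.
    static double weighted(double avgA, long countA, double avgB, long countB) {
        return (avgA * countA + avgB * countB) / (countA + countB);
    }
}
```

With window A averaging 10 ms over 100 ops and window B averaging 100 ms over 1 op, the naive roll-up reports 55 ms, while the true combined average is 1100/101 ≈ 10.89 ms.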
> This is why Dropwizard and other metrics systems have a specific set of 
> supported metrics, not objects, that they know mathematically work out.  A 
> gauge is different from a counter, which is different from a ratio, or a 
> meter, or a timer, or a histogram.  Please let's not reinvent the wheel here; 
> we already did it wrong once, let's not do it wrong again.  We are using 
> Dropwizard in other places in the code internally; I would prefer that we 
> standardize on it, or on a thin wrapper around it based on the same concepts.  
> Or if there is a different API that someone here would prefer that we use, 
> that is fine with me too.  But let's not write it ourselves; let's take from 
> the experts who have spent a long time building something that works.
>  - Bobby
>
>    On Tuesday, May 17, 2016 10:10 PM, Jungtaek Lim <[email protected]> wrote:
>
>
> Hi devs,
>
> Since IMetric#getValueAndReset doesn't restrict the return type, it gives us
> flexibility, but the metrics consumer has to parse the value without context
> (relying on just some assumptions).
>
> I've looked into some open source metrics consumers, and many of them support
> Number and Map<String, Number/String>, and one of them supports nested Maps.
> For the case of a Map, its key is appended to the metric key and the value is
> converted to 'double'. I think that would be enough, but I'm not sure we can
> rely on all metrics consumers to handle it properly.
>
> I feel it would be great if we could recommend proper types of DataPoint
> values for storing metrics to a time-series DB via a metrics consumer. It
> could serve as a protocol between IMetric users and metrics consumer
> developers.
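The Map handling described above, where each map key is appended to the metric key and the value coerced to double, could be sketched like this (plain Java; `DataPointFlattener` is an illustrative name, not an actual metrics consumer implementation, and real consumers vary in how they treat non-numeric values):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Flatten a Map-valued data point into "<metricKey>.<mapKey>" -> double
// entries, the convention several metrics consumers appear to follow.
class DataPointFlattener {
    static Map<String, Double> flatten(String metricKey, Map<String, Object> value) {
        Map<String, Double> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : value.entrySet()) {
            if (e.getValue() instanceof Number) {
                out.put(metricKey + "." + e.getKey(),
                        ((Number) e.getValue()).doubleValue());
            }
            // Non-numeric values are silently skipped in this sketch;
            // a real consumer might stringify or reject them instead.
        }
        return out;
    }
}
```

So a data point `bolt.acked -> {count=5, latency=12.5}` would become the two series `bolt.acked.count = 5.0` and `bolt.acked.latency = 12.5`.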
>
> What do you think?
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> ps. I'm not a heavy user of time-series DBs (I researched some, but they
> don't document the type/size of values clearly), so if someone could provide
> information on the supported type/size of values in time-series DBs, that
> would be great. Or we can just treat numbers as 'double' as above and move
> forward.
>
>

-- 
Regards,
Abhishek Agarwal
