The response on this important issue has been pretty good. I am happily surprised :)

I want to mention our strategy for extracting metrics from other products. We use jolokia_proxy (https://jolokia.org/features/proxy.html) to get JMX beans from several software products and feed them to Telegraf. That way, we avoid writing custom processors for all these different products.
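As a rough illustration of the proxy approach, here is a minimal sketch of the JSON body a client would POST to a Jolokia proxy for a "read" request. The host name, port, MBean choice, and the helper name `build_proxy_read_request` are hypothetical and only for illustration; the `target` section is how a proxy-mode request names the JVM to be queried.

```python
import json


def build_proxy_read_request(mbean, attribute, target_host, target_port):
    """Build the JSON body for a Jolokia proxy-mode 'read' request (illustrative helper)."""
    # In proxy mode the request carries a "target" section with the JSR-160
    # JMX service URL of the JVM that the proxy should query on our behalf.
    target_url = "service:jmx:rmi:///jndi/rmi://%s:%d/jmxrmi" % (target_host, target_port)
    return {
        "type": "read",
        "mbean": mbean,
        "attribute": attribute,
        "target": {"url": target_url},
    }


# Hypothetical target: a JVM exposing remote JMX on port 9010.
request_body = build_proxy_read_request(
    "java.lang:type=Memory", "HeapMemoryUsage", "worker-node-1.example.com", 9010
)
print(json.dumps(request_body, indent=2))
```

The same request shape works for any product that exposes JMX, which is why one proxy can serve several different products without product-specific code.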
Telegraf is quickly becoming a standard for metrics data (just see the list of input plugins here: https://github.com/influxdata/telegraf/tree/master/plugins/inputs), and it integrates well with several outputs too (https://github.com/influxdata/telegraf/tree/master/plugins/outputs).

Also, since the metrics are in JMX, they can be queried by a jolokia-agent installed per node. This avoids the extra metrics-consumer bolt, which can hurt the topology throughput too.

So I cast my vote in favor of a JMX implementation of metrics. Other approaches are welcome to be discussed.

On Tue, Oct 11, 2016 at 7:30 PM, Alessandro Bellina <[email protected]> wrote:

> Yeap that's a requirement from our perspective (working through this
> list).
> Sure I think as usual we can start with master with an eye for what would
> need to be back ported.
>
> On Tuesday, October 11, 2016, 8:50 PM, P. Taylor Goetz <[email protected]>
> wrote:
>
> I hope I didn't come across as overly critical. You did the best with what
> you had to work with. Which isn't pretty.
>
> We could potentially do a parallel metrics API in 1.1, 1.2, or master and
> still stay close to semantic versioning...?
>
> -Taylor
>
> > On Oct 11, 2016, at 9:28 PM, Jungtaek Lim <[email protected]> wrote:
> >
> > Yeah, I admit that the configuration flag was bad for me also, but I had
> > no alternatives. The only way to avoid struggling with the design
> > limitation is a revamp / redesign.
> > Thanks S G for expressing willingness to volunteer, and great news from
> > Alessandro about that project.
> > Alessandro, could you forward the upcoming news for the project to the
> > dev@ list?
> >
> > - Jungtaek Lim (HeartSaVioR)
> >
> > On Wed, Oct 12, 2016 at 10:22 AM, P. Taylor Goetz <[email protected]> wrote:
> >
> >> I was thinking on a smaller scale in terms of effort, but the more I
> >> think about it, the more supportive I would be of a full revamp (new
> >> API) for metrics based on Coda Hale's metrics library. It's proven and
> >> stable. I've used it many times. I think either approach would be
> >> roughly the same amount of work.
> >>
> >> Some of the metrics API improvements in the 1.1.x branch are nice, but
> >> IMHO are lipstick on a pig.
> >>
> >> With apologies to Jungtaek, who has done amazing work all across the
> >> codebase, I'm a little squeamish about the proposed change to metrics
> >> that changes the consumer API based on a configuration flag (don't know
> >> the PR number offhand).
> >>
> >> I'm +1 for moving in this direction (revamped metrics). Let's end the
> >> metrics pain.
> >>
> >> -Taylor
> >>
> >>> On Oct 11, 2016, at 10:15 AM, Bobby Evans <[email protected]> wrote:
> >>>
> >>> I agree that IMetricsConsumer is not good, but the reality is that all
> >>> of the metrics system needs to be redone. The problem is that we ship
> >>> an object as a metric. If I get an object I have no idea what it is,
> >>> and hence no idea how to report it or what to do with it. What is more,
> >>> the common types we use in the metrics we provide are really not
> >>> enough. For example, CountMetric sends a Long. Well, when I get it in
> >>> the metrics consumer I have no idea if I should report it like a
> >>> counter or if I should report it like a gauge (something that every
> >>> metrics system I have used wants to know). But then we support
> >>> pre-aggregation of the metrics with IReducer, so the number I get might
> >>> be an average instead of either a gauge or a counter, which no good
> >>> metrics system will want to collect because I cannot aggregate it with
> >>> anything else; the math just does not work.
> >>> The proposal I have made before, and still believe in, is that we need
> >>> to put in place a parallel metrics API/system. We will deprecate all of
> >>> https://git.corp.yahoo.com/storm/storm/tree/master-security/storm-core/src/jvm/backtype/storm/metric/api
> >>> and create a new parallel one that provides an API similar to
> >>> http://metrics.dropwizard.io/3.1.0/. I would even be fine with just
> >>> using their API and exposing that to end users. Dropwizard has solved
> >>> all of these problems already and I don't see a reason to reinvent the
> >>> wheel. I don't personally see a lot of value in trying to send all of
> >>> the metrics through Storm itself. I am fine if we are able to support
> >>> that, but it is far from a requirement.
> >>>
> >>> - Bobby
> >>>
> >>> On Monday, October 10, 2016 10:47 PM, S G <[email protected]> wrote:
> >>>
> >>> +1
> >>> We can probably start by opening a JIRA for this and adding a design
> >>> approach for the same?
> >>> I would like to help in the coding effort for this.
> >>>
> >>> -SG
> >>>
> >>>> On Mon, Oct 10, 2016 at 1:51 PM, P. Taylor Goetz <[email protected]> wrote:
> >>>>
> >>>> I've been thinking about metrics lately, specifically the fact that
> >>>> people tend to struggle with implementing a metrics consumer. (Like
> >>>> this one [1]).
> >>>>
> >>>> The IMetricsConsumer interface is pretty low level, and common
> >>>> aggregations, calculations, etc. are left up to each individual
> >>>> implementation. That seems like an area where further abstraction
> >>>> would make it easier to support different back ends (Graphite, JMX,
> >>>> Splunk, etc.).
> >>>>
> >>>> My thought is to create an abstract IMetricsConsumer implementation
> >>>> that does common aggregations and calculations, and then delegates to
> >>>> a pluggable "metrics sink" implementation (e.g. "IMetricsSink", etc.).
> >>>> That would greatly simplify the effort required to integrate with
> >>>> various external metrics systems.
> >>>> I know of at least a few users that would be interested; one is
> >>>> currently scraping the logs from LoggingMetricsConsumer and polling
> >>>> the Storm REST API for their metrics.
> >>>>
> >>>> -Taylor
> >>>>
> >>>> [1] http://twocentsonsoftware.blogspot.co.il/2014/12/sending-out-storm-metrics.html
> >>>>
> >>>>> On Oct 10, 2016, at 12:14 PM, Bobby Evans <[email protected]> wrote:
> >>>>>
> >>>>> First of all, the server exposes essentially the same interface that
> >>>>> the IMetricsConsumer exposes. It mostly just adds a bunch of overhead
> >>>>> in the middle to serialize out the objects and send them over HTTP to
> >>>>> another process, which then has to deserialize them and process them.
> >>>>> If you really don't need the metrics to show up on a special known
> >>>>> box, you can have that exact same code running inside the metrics
> >>>>> consumer without all of the overhead.
> >>>>> The server/client are insecure, have to deal with thread issues that
> >>>>> a normal IMetricsConsumer does not, and are not written to be robust
> >>>>> (if the HTTP server is down, the consumer crashes and continues to
> >>>>> crash until the server is brought back up). It was written very
> >>>>> quickly for a test situation and it honestly never crossed my mind
> >>>>> that anyone would want to use it in production.
> >>>>>
> >>>>> - Bobby
> >>>>>
> >>>>> On Monday, October 10, 2016 10:59 AM, S G <[email protected]> wrote:
> >>>>>
> >>>>> Thanks Bobby.
> >>>>>
> >>>>> If we write our own metrics consumer, how do we ensure that it is
> >>>>> better than HttpForwardingMetricsServer? In other words, what aspects
> >>>>> of the HttpForwardingMetricsServer should we avoid to make our own
> >>>>> metrics consumer better and ready for production?
> >>>>>
> >>>>> Is verisign/storm-graphite <https://github.com/verisign/storm-graphite>
> >>>>> production ready?
> >>>>>
> >>>>> Also, we should add a line about the production-readiness of
> >>>>> HttpForwardingMetricsServer in the documentation at
> >>>>> http://storm.apache.org/releases/1.0.2/Metrics.html
> >>>>> (We were just about to think seriously about using this for
> >>>>> production, as we thought this to be the standard solution for
> >>>>> metrics consumption in the 1.0+ versions).
> >>>>>
> >>>>> -SG
> >>>>>
> >>>>> On Mon, Oct 10, 2016 at 6:37 AM, Bobby Evans <[email protected]> wrote:
> >>>>>
> >>>>>> First of all, there really are two different sets of metrics. One
> >>>>>> set is the topology metrics and the other set is the daemon metrics
> >>>>>> (metrics for things like the ui and nimbus). The
> >>>>>> JmxPreparableReporter plugin only exposes daemon metrics, not the
> >>>>>> topology metrics, through JMX. Exposing topology metrics through
> >>>>>> JMX is a non-trivial task. The current metrics feature was not
> >>>>>> designed for this. We are in the process of trying to redesign the
> >>>>>> metrics system to allow for features like this, but it is still a
> >>>>>> ways off.
> >>>>>>
> >>>>>> - Bobby
> >>>>>>
> >>>>>> On Saturday, October 8, 2016 11:39 AM, S G <[email protected]> wrote:
> >>>>>>
> >>>>>> Thanks Bobby,
> >>>>>>
> >>>>>> We will need some kind of IMetricsConsumer to talk to Telegraf.
> >>>>>> Many other software products like Solr, Elasticsearch, Cassandra,
> >>>>>> etc. provide metrics through a URL, making them very easy to consume
> >>>>>> by tools like Telegraf.
> >>>>>> How about an IMetricsConsumer that will run on storm-ui and provide
> >>>>>> the metrics through a URL such as <storm-ui-host>/metrics ?
> >>>>>>
> >>>>>> Also, I see the following option in defaults.yaml:
> >>>>>> #default storm daemon metrics reporter plugins
> >>>>>> storm.daemon.metrics.reporter.plugins:
> >>>>>>     - "org.apache.storm.daemon.metrics.reporters.JmxPreparableReporter"
> >>>>>>
> >>>>>> Is this a good option to use for converting metrics into JMX?
> >>>>>>
> >>>>>> Thanks
> >>>>>> SG
> >>>>>>
> >>>>>> On Fri, Oct 7, 2016 at 8:11 AM, Bobby Evans <[email protected]> wrote:
> >>>>>>
> >>>>>> HttpForwardingMetricsServer is a real hack intended really for
> >>>>>> tests. I know, I wrote it :). Please don't use it in production. You
> >>>>>> can write your own IMetricsConsumer to do whatever you want with the
> >>>>>> metrics.
> >>>>>> https://github.com/apache/storm/blob/master/storm-core/src/jvm/org/apache/storm/metric/api/IMetricsConsumer.java
> >>>>>>
> >>>>>> That is the correct way to get the data out. If you want to write a
> >>>>>> bridge to JMX for this, that might work, but going directly to
> >>>>>> Telegraf would probably be better.
> >>>>>>
> >>>>>> - Bobby
> >>>>>>
> >>>>>> On Thursday, October 6, 2016 1:43 PM, S G <[email protected]> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> We want to use Telegraf
> >>>>>> (https://github.com/influxdata/telegraf/tree/master/plugins) for
> >>>>>> getting Storm's metrics.
> >>>>>>
> >>>>>> But we do not want to add a HttpForwardingMetricsServer just to get
> >>>>>> the metrics and send them to Telegraf.
> >>>>>>
> >>>>>> The other option is to use Jolokia (https://jolokia.org/), which can
> >>>>>> read JMX and write into Telegraf.
> >>>>>>
> >>>>>> Does Storm report all its metrics (including those of custom
> >>>>>> spouts/bolts) into JMX?
> >>>>>> Or is spawning a HttpForwardingMetricsServer the only option?
> >>>>>>
> >>>>>> Thanks
> >>>>>> SG
