I would like to volunteer for this (Have contributed to Solr, Avro and Hive previously). But I would need a little guidance initially to get started because I haven't dug too deep in storm's code-base.
-SG

On Tue, Oct 11, 2016 at 3:52 PM, Jungtaek Lim <kabh...@gmail.com> wrote:

> Alessandro, that was before Verisign introduced storm-graphite. As a
> result, he followed Storm's metrics system.
>
> I have initiated the discussion around metrics several times (mostly in
> the first half of this year), and each time the conclusion was that none
> of the metrics interfaces are good and we need to rework them. Finding a
> way that resorts to the current interfaces doesn't help.
> The question is how we renew the metrics system. I was waiting on the
> JStorm merger phase so that we could evaluate JStorm's new metrics
> system. I guess adopting that would be enough of a change, but I saw
> that Alibaba ran into a memory issue, and how they handled it does not
> seem to be public. For me the problem is that the JStorm merger is going
> really slowly (it has nearly stopped), so phase 2 might not happen this
> year.
> As for other options, I believe Kafka (Kafka Streams) and Flink have
> renewed their metrics, so we may want to take a look at them.
>
> Anyway, if someone volunteers to renew the metrics system via Dropwizard
> or JStorm metrics, it would be awesome. I'm focused on improving Storm
> SQL, and there's no other active contributor on Storm SQL yet, so I
> can't look at the other side.
>
> - Jungtaek Lim (HeartSaVioR)
>
> On Wed, Oct 12, 2016 at 3:15 AM, Alessandro Bellina
> <abell...@yahoo-inc.com.invalid> wrote:
>
> > sorry, hopefully the link goes through now:
> > http://www.michael-noll.com/blog/2013/11/06/sending-metrics-from-storm-to-graphite
> >
> > Sending Metrics from Storm to Graphite - Michael G. Noll
> > By Michael G. Noll
> > Sending application-level metrics from Storm topologies to the Graphite
> > monitoring system
> >
> > On Tuesday, October 11, 2016 1:13 PM, Alessandro Bellina
> > <abell...@yahoo-inc.com.INVALID> wrote:
> >
> > I think what Bobby is referring to is that the metrics consumer is
> > another bolt, so stats are flowing through Storm.
> > What does changing the model to polling buy us?
> > I could see cases where we'd need more error handling, for instance
> > for slow/busy workers.
> > If we think that writing a new system is the way to go (say with
> > codahale throughout), would working on an abstraction layer that is
> > used by the daemons but also by end users be a good place to start?
> > With codahale as the implementation?
> > Looks like Michael Noll has done a lot of work with codahale, for
> > instance: Sending Metrics from Storm to Graphite - Michael G. Noll.
> >
> > Thanks,
> > Alessandro
> >
> > On Tuesday, October 11, 2016 11:07 AM, S G
> > <sg.online.em...@gmail.com> wrote:
> >
> > "Dropwizard has solved all of these problems already and I don't see a
> > reason to reinvent the wheel" - I love Dropwizard too, and many other
> > tools have switched to it as well.
> >
> > "I don't personally see a lot of value in trying to send all of the
> > metrics through storm itself" - How about every node reporting its own
> > metrics at a URL? That way there is no need for a metrics consumer
> > that can bottleneck the whole topology. We can then provide a separate
> > server that queries all nodes for those metrics and aggregates them.
> > Only cluster-wide metrics should be reported by the Storm UI's REST
> > API (assuming there are not too many of those).
> >
> > On Tue, Oct 11, 2016 at 7:15 AM, Bobby Evans
> > <ev...@yahoo-inc.com.invalid> wrote:
> >
> > > I agree that IMetricsConsumer is not good, but the reality is that
> > > all of the metrics system needs to be redone. The problem is that we
> > > ship an object as a metric. If I get an object I have no idea what
> > > it is, and hence no idea how to report it or what to do with it.
> > > What is more, the common types we use in the metrics we provide are
> > > really not enough. For example, CountMetric sends a Long. When I get
> > > it in the metrics consumer I have no idea whether I should report it
> > > as a counter or as a gauge (something that every metrics system I
> > > have used wants to know). But then we also support pre-aggregation
> > > of the metrics with IReducer, so the number I get might be an
> > > average instead of either a gauge or a counter, which no good
> > > metrics system will want to collect because I cannot aggregate it
> > > with anything else; the math just does not work.
> > > The proposal I have made before, and still believe in, is that we
> > > need to put in place a parallel metrics API/system. We will
> > > deprecate all of
> > > https://git.corp.yahoo.com/storm/storm/tree/master-security/storm-core/src/jvm/backtype/storm/metric/api
> > > and create a new parallel one that provides an API similar to
> > > http://metrics.dropwizard.io/3.1.0/. I would even be fine with just
> > > using their API and exposing that to end users. Dropwizard has
> > > solved all of these problems already and I don't see a reason to
> > > reinvent the wheel. I don't personally see a lot of value in trying
> > > to send all of the metrics through storm itself. I am fine if we are
> > > able to support that, but it is far from a requirement.
> > >
> > > - Bobby
> > >
> > > On Monday, October 10, 2016 10:47 PM, S G
> > > <sg.online.em...@gmail.com> wrote:
> > >
> > > +1
> > > We can probably start by opening a JIRA for this and adding a design
> > > approach for the same?
> > > I would like to help in the coding effort for this.
> > >
> > > -SG
> > >
> > > On Mon, Oct 10, 2016 at 1:51 PM, P. Taylor Goetz <ptgo...@gmail.com>
> > > wrote:
> > >
> > > > I’ve been thinking about metrics lately, specifically the fact
> > > > that people tend to struggle with implementing a metrics consumer.
> > > > (Like this one [1]).
> > > >
> > > > The IMetricsConsumer interface is pretty low level, and common
> > > > aggregations, calculations, etc. are left up to each individual
> > > > implementation. That seems like an area where further abstraction
> > > > would make it easier to support different back ends (Graphite,
> > > > JMX, Splunk, etc.).
> > > >
> > > > My thought is to create an abstract IMetricsConsumer
> > > > implementation that does common aggregations and calculations, and
> > > > then delegates to a pluggable “metrics sink” implementation (e.g.
> > > > “IMetricsSink”, etc.). That would greatly simplify the effort
> > > > required to integrate with various external metrics systems. I
> > > > know of at least a few users that would be interested; one is
> > > > currently scraping the logs from LoggingMetricsConsumer and
> > > > polling the Storm REST API for their metrics.
> > > >
> > > > -Taylor
> > > >
> > > > [1] http://twocentsonsoftware.blogspot.co.il/2014/12/sending-out-storm-metrics.html
> > > >
> > > > > On Oct 10, 2016, at 12:14 PM, Bobby Evans
> > > > > <ev...@yahoo-inc.com.INVALID> wrote:
> > > > >
> > > > > First of all the server exposes essentially the same interface
> > > > > that the IMetricsConsumer exposes. It mostly just adds a bunch
> > > > > of overhead in the middle to serialize out the objects and send
> > > > > them over HTTP to another process, which then has to deserialize
> > > > > them and process them. If you really don't need the metrics to
> > > > > show up on a special known box, you can have that exact same
> > > > > code running inside the metrics consumer without all of the
> > > > > overhead.
> > > > > The server/client are insecure, have to deal with thread issues
> > > > > that a normal IMetricsConsumer does not, and are not written to
> > > > > be robust (if the HTTP server is down, the consumer crashes and
> > > > > continues to crash until the server is brought back up). It was
> > > > > written very quickly for a test situation and it honestly never
> > > > > crossed my mind that anyone would want to use it in production.
> > > > >
> > > > > - Bobby
> > > > >
> > > > > On Monday, October 10, 2016 10:59 AM, S G
> > > > > <sg.online.em...@gmail.com> wrote:
> > > > >
> > > > > Thanks Bobby.
> > > > >
> > > > > If we write our own metrics consumer, how do we ensure that it
> > > > > is better than HttpForwardingMetricsServer? In other words, what
> > > > > aspects of HttpForwardingMetricsServer should we avoid to make
> > > > > our own metrics consumer better and ready for production?
> > > > >
> > > > > Is verisign/storm-graphite
> > > > > <https://github.com/verisign/storm-graphite> production ready?
> > > > >
> > > > > Also, we should add a line about the production-readiness of
> > > > > HttpForwardingMetricsServer to the documentation at
> > > > > http://storm.apache.org/releases/1.0.2/Metrics.html
> > > > > (We were just about to seriously consider using this in
> > > > > production, as we thought it was the standard solution for
> > > > > consuming metrics in the 1.0+ versions).
> > > > >
> > > > > -SG
> > > > >
> > > > > On Mon, Oct 10, 2016 at 6:37 AM, Bobby Evans
> > > > > <ev...@yahoo-inc.com> wrote:
> > > > >
> > > > >> First of all there really are two different sets of metrics.
> > > > >> One set is the topology metrics and the other set is the daemon
> > > > >> metrics (metrics for things like the UI and Nimbus). The
> > > > >> JmxPreparableReporter plugin only exposes daemon metrics, not
> > > > >> the topology metrics, through JMX.
> > > > >> Exposing topology metrics through JMX is a non-trivial task.
> > > > >> The current metrics feature was not designed for this. We are
> > > > >> in the process of trying to redesign the metrics system to
> > > > >> allow for features like this, but it is still a ways off.
> > > > >>
> > > > >> - Bobby
> > > > >>
> > > > >> On Saturday, October 8, 2016 11:39 AM, S G
> > > > >> <sg.online.em...@gmail.com> wrote:
> > > > >>
> > > > >> Thanks Bobby,
> > > > >>
> > > > >> We will need some kind of IMetricsConsumer to talk to telegraf.
> > > > >> Many other systems like Solr, Elasticsearch, Cassandra, etc.
> > > > >> provide metrics through a URL, making them very easy to consume
> > > > >> with tools like telegraf.
> > > > >> How about an IMetricsConsumer that will run on storm-ui and
> > > > >> provide the metrics through a URL such as
> > > > >> <storm-ui-host>/metrics ?
> > > > >>
> > > > >> Also, I see the following option in defaults.yaml:
> > > > >> #default storm daemon metrics reporter plugins
> > > > >> storm.daemon.metrics.reporter.plugins:
> > > > >>   - "org.apache.storm.daemon.metrics.reporters.JmxPreparableReporter"
> > > > >>
> > > > >> Is this a good option to use for converting metrics into JMX?
> > > > >>
> > > > >> Thanks
> > > > >> SG
> > > > >>
> > > > >> On Fri, Oct 7, 2016 at 8:11 AM, Bobby Evans
> > > > >> <ev...@yahoo-inc.com.invalid> wrote:
> > > > >>
> > > > >> HttpForwardingMetricsServer is a real hack intended really for
> > > > >> tests. I know, I wrote it :). Please don't use it in
> > > > >> production. You can write your own IMetricsConsumer to do
> > > > >> whatever you want with the metrics.
> > > > >> https://github.com/apache/storm/blob/master/storm-core/src/jvm/org/apache/storm/metric/api/IMetricsConsumer.java
> > > > >>
> > > > >> That is the correct way to get the data out. If you want to
> > > > >> write a bridge to JMX for this that might work, but going
> > > > >> directly to telegraf would probably be better.
> > > > >>
> > > > >> - Bobby
> > > > >>
> > > > >> On Thursday, October 6, 2016 1:43 PM, S G
> > > > >> <sg.online.em...@gmail.com> wrote:
> > > > >>
> > > > >> Hi,
> > > > >>
> > > > >> We want to use Telegraf
> > > > >> (https://github.com/influxdata/telegraf/tree/master/plugins)
> > > > >> for getting Storm's metrics.
> > > > >>
> > > > >> But we do not want to add an HttpForwardingMetricsServer just
> > > > >> to get the metrics and send them to telegraf.
> > > > >>
> > > > >> The other option is to use Jolokia (https://jolokia.org/), which
> > > > >> can read JMX and write into telegraf.
> > > > >>
> > > > >> Does Storm report all its metrics (including those of custom
> > > > >> spouts/bolts) into JMX?
> > > > >> Or is spawning an HttpForwardingMetricsServer the only option?
> > > > >>
> > > > >> Thanks
> > > > >> SG
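
For anyone reading the thread later: Taylor's suggestion (a consumer that does the common aggregation and then delegates to a pluggable sink), combined with Bobby's point that a metric has to say whether it is a counter or a gauge, might be sketched roughly as below. This is only an illustration under stated assumptions. `MetricsSink`, `AggregatingConsumer`, `LineSink` and every method name are hypothetical; none of this is Storm's or Dropwizard's actual API, and the sketch deliberately depends on neither.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Small demo driver for the hypothetical types below. */
class MetricsSinkSketch {
    public static void main(String[] args) {
        LineSink sink = new LineSink();
        AggregatingConsumer consumer = new AggregatingConsumer(sink);
        consumer.onCounterDelta("tuples.acked", 40);
        consumer.onCounterDelta("tuples.acked", 2);
        consumer.onGauge("queue.depth", 7.0);
        consumer.onGauge("queue.depth", 3.0);
        consumer.flush();
        sink.lines.forEach(System.out::println);
    }
}

/** Hypothetical sink: the metric type is part of the call, not guessed from an Object. */
interface MetricsSink {
    void counter(String name, long count);
    void gauge(String name, double value);
}

/** Hypothetical shared layer: aggregates by metric type, then delegates to the sink. */
class AggregatingConsumer {
    private final MetricsSink sink;
    private final Map<String, Long> counters = new LinkedHashMap<>();
    private final Map<String, Double> gauges = new LinkedHashMap<>();

    AggregatingConsumer(MetricsSink sink) {
        this.sink = sink;
    }

    /** Counter deltas are summed, so they stay aggregatable downstream. */
    void onCounterDelta(String name, long delta) {
        counters.merge(name, delta, Long::sum);
    }

    /** Gauges are point-in-time readings: the last value reported wins. */
    void onGauge(String name, double value) {
        gauges.put(name, value);
    }

    /** Pushes everything collected since the previous flush, then resets. */
    void flush() {
        counters.forEach(sink::counter);
        gauges.forEach(sink::gauge);
        counters.clear();
        gauges.clear();
    }
}

/** Toy back end that records one text line per metric. */
class LineSink implements MetricsSink {
    final List<String> lines = new ArrayList<>();

    @Override
    public void counter(String name, long count) {
        lines.add("counter " + name + " " + count);
    }

    @Override
    public void gauge(String name, double value) {
        lines.add("gauge " + name + " " + value);
    }
}
```

Running `MetricsSinkSketch` prints `counter tuples.acked 42` followed by `gauge queue.depth 3.0`. A Graphite or JMX back end would then only need its own `MetricsSink`, and because the sink is told the metric type it never has to guess what a bare `Long` means.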