[ https://issues.apache.org/jira/browse/SOLR-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15693048#comment-15693048 ]
Andrzej Bialecki commented on SOLR-4735:
-----------------------------------------

I'm interested in getting this issue resolved, so I'd be happy to work on committing this (I asked Alan and he doesn't mind :) ).

[~cpoerschke] & Kelvin: great stuff, I really like the abstractions. I share Jeff's concern, though, that we need to consider how to maintain metrics that outlive any particular core instance (even across core reload events :) ). Core reloads may be triggered in several ways (explicit action, config change, replication). I'm not sure under which scenario I'd prefer to reset metrics from a previous version of the core...

Eventually we will also want to instrument other aspects of Solr, things that happen outside SolrCore (e.g. SolrCloud operations, replication, leader metrics per replica, replica recovery stats, Jetty connections, heap, etc.). For these, using {{SharedMetricRegistries}} would make more sense, so the question is whether we should use two different mechanisms for managing {{MetricRegistry}} instances, the other one being {{SolrMetricManager}}. Perhaps {{SolrMetricManager}} should use long-lived {{MetricRegistry}} instances that are managed in {{SharedMetricRegistries}}?

Also, from the point of view of monitoring the overall "load" of a particular node, it would make sense to track some really low-level Lucene stuff as well, such as major merges and read/write IO, but this can come later - let's first get the design right.
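To make the core-reload concern in the comment above concrete, here is a minimal sketch of registries that are looked up by name in a node-wide holder, so counts survive when a core object is replaced. It deliberately does not use the Dropwizard Metrics library: {{SimpleRegistry}}, {{SharedRegistries}}, {{FakeCore}} and the "solr.core." name prefix are all hypothetical stand-ins invented for illustration, not Solr or Metrics APIs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for a metrics registry: a named bag of counters.
final class SimpleRegistry {
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    LongAdder counter(String name) {
        return counters.computeIfAbsent(name, k -> new LongAdder());
    }
}

// Hypothetical node-wide holder. Registries live here, not inside a core,
// so they are not discarded when a core instance goes away.
final class SharedRegistries {
    private static final Map<String, SimpleRegistry> REGISTRIES = new ConcurrentHashMap<>();

    private SharedRegistries() {}

    static SimpleRegistry getOrCreate(String name) {
        return REGISTRIES.computeIfAbsent(name, k -> new SimpleRegistry());
    }

    // An explicit reset policy: drop the registry so the next lookup starts from zero.
    static void remove(String name) {
        REGISTRIES.remove(name);
    }
}

// Hypothetical core: it looks up its registry by name instead of owning it.
final class FakeCore {
    private final SimpleRegistry registry;

    FakeCore(String coreName) {
        // The "solr.core." prefix is an assumption made for this sketch.
        this.registry = SharedRegistries.getOrCreate("solr.core." + coreName);
    }

    void handleRequest() {
        registry.counter("requests").increment();
    }

    long requestCount() {
        return registry.counter("requests").sum();
    }
}

final class ReloadDemo {
    public static void main(String[] args) {
        FakeCore before = new FakeCore("collection1");
        before.handleRequest();
        before.handleRequest();

        // Simulated reload: a new core object with the same name.
        FakeCore after = new FakeCore("collection1");
        after.handleRequest();

        System.out.println(after.requestCount()); // prints 3
    }
}
```

With this shape, whether a given reload cause (explicit action, config change, replication) resets metrics becomes an explicit decision to call {{remove}}, rather than a side effect of the core being rebuilt.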
> Improve Solr metrics reporting
> ------------------------------
>
>                 Key: SOLR-4735
>                 URL: https://issues.apache.org/jira/browse/SOLR-4735
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>            Priority: Minor
>         Attachments: SOLR-4735.patch, SOLR-4735.patch, SOLR-4735.patch,
>                      SOLR-4735.patch
>
>
> Following on from a discussion on the mailing list:
> http://search-lucene.com/m/IO0EI1qdyJF1/codahale&subj=Solr+metrics+in+Codahale+metrics+and+Graphite+
>
> It would be good to make Solr play more nicely with existing devops
> monitoring systems, such as Graphite or Ganglia. Stats monitoring at the
> moment is poll-only, either via JMX or through the admin stats page. I'd
> like to refactor things a bit to make this more pluggable.
>
> This patch is a start. It adds a new interface, InstrumentedBean, which
> extends SolrInfoMBean to return a
> [[Metrics|http://metrics.codahale.com/manual/core/]] MetricRegistry, and a
> couple of MetricReporters (which basically just duplicate the JMX and admin
> page reporting that's there at the moment, but which should be more
> extensible). The patch includes a change to RequestHandlerBase showing how
> this could work. The idea would be to eventually replace the getStatistics()
> call on SolrInfoMBean with this instead.
>
> The next step would be to allow more MetricReporters to be defined in
> solrconfig.xml. The Metrics library comes with ganglia and graphite
> reporting modules, and we can add contrib plugins for both of those.
>
> There's some more general cleanup that could be done around SolrInfoMBean
> (we've got two plugin handlers at /mbeans and /plugins that basically do the
> same thing, and the beans themselves have some weirdly inconsistent data on
> them - getVersion() returns different things for different impls, and
> getSource() seems pretty useless), but maybe that's for another issue.
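As a rough illustration of the pluggable shape the issue description outlines (a bean that exposes metrics, plus interchangeable reporters), the sketch below uses only the JDK. The interface and class names ({{InfoBean}}, {{InstrumentedBeanSketch}}, {{ReporterSketch}}, {{GraphiteLineReporter}}, {{SearchHandlerSketch}}) are hypothetical and are not the types in the attached patch; a plain map stands in for a real MetricRegistry. The output follows Graphite's plaintext line layout of "path value timestamp".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Heavily simplified stand-in for SolrInfoMBean (hypothetical).
interface InfoBean {
    String getName();
}

// Stand-in for the proposed InstrumentedBean: the bean exposes named metrics.
interface InstrumentedBeanSketch extends InfoBean {
    Map<String, Long> metrics();
}

// Hypothetical plug-in point: each reporter decides how to render a bean's metrics.
interface ReporterSketch {
    List<String> report(InstrumentedBeanSketch bean, long epochSeconds);
}

// Renders each metric as a "path value timestamp" line.
final class GraphiteLineReporter implements ReporterSketch {
    private final String prefix;

    GraphiteLineReporter(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public List<String> report(InstrumentedBeanSketch bean, long epochSeconds) {
        List<String> lines = new ArrayList<>();
        // Copy into a TreeMap so lines come out in a stable, sorted order.
        for (Map.Entry<String, Long> e : new TreeMap<>(bean.metrics()).entrySet()) {
            lines.add(prefix + "." + bean.getName() + "." + e.getKey()
                    + " " + e.getValue() + " " + epochSeconds);
        }
        return lines;
    }
}

// Hypothetical request handler that counts requests and errors.
final class SearchHandlerSketch implements InstrumentedBeanSketch {
    private long requests;
    private long errors;

    void recordRequest(boolean failed) {
        requests++;
        if (failed) {
            errors++;
        }
    }

    @Override
    public String getName() {
        return "search";
    }

    @Override
    public Map<String, Long> metrics() {
        Map<String, Long> snapshot = new TreeMap<>();
        snapshot.put("requests", requests);
        snapshot.put("errors", errors);
        return snapshot;
    }
}
```

A second {{ReporterSketch}} implementation (say, one that renders for an admin page) could be added without touching the handler, which is the extensibility the description is aiming for.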
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)