[ 
https://issues.apache.org/jira/browse/SOLR-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15693048#comment-15693048
 ] 

Andrzej Bialecki  commented on SOLR-4735:
-----------------------------------------

I'm interested in getting this issue resolved, so I'd be happy to work on 
committing this (I asked Alan and he doesn't mind :) ).

[~cpoerschke] & Kelvin: great stuff, I really like the abstractions. I share 
Jeff's concern though that we need to consider how to maintain metrics that 
outlive any particular core instance (and even core reload events :) ). A core 
reload can have several causes (explicit action, config change, 
replication), and I'm not sure in which of these scenarios I'd want to reset 
the metrics carried over from a previous version of the core...

Eventually we will also want to instrument other aspects of Solr, things that 
happen outside SolrCore (e.g. SolrCloud operations, replication, leader metrics 
per replica, replica recovery stats, Jetty connections, heap, etc.). For these, 
using {{SharedMetricRegistries}} would make more sense, so the question is 
whether we should use two different mechanisms for managing {{MetricRegistry}} 
instances, the other one being {{SolrMetricManager}}. Perhaps 
{{SolrMetricManager}} should use long-lived {{MetricRegistry}} instances that 
are managed in {{SharedMetricRegistries}}?
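
To make the idea concrete, here is a rough, stdlib-only sketch of the pattern: 
metrics live in a named, process-wide registry and a core only holds a 
reference to it, so the numbers survive a reload unless someone resets them 
explicitly. In the Metrics library the analogous lookup would be 
{{SharedMetricRegistries.getOrCreate(name)}}. The class names 
({{NamedRegistries}}, {{CoreSketch}}) and the "solr.core.<name>" naming scheme 
are made up for illustration and are not part of the patch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical stand-in for a process-wide set of named counter registries. */
final class NamedRegistries {
    private static final ConcurrentMap<String, ConcurrentMap<String, LongAdder>> REGISTRIES =
            new ConcurrentHashMap<>();

    private NamedRegistries() {}

    /** Returns the registry with this name, creating it on first use. */
    static ConcurrentMap<String, LongAdder> getOrCreate(String name) {
        return REGISTRIES.computeIfAbsent(name, k -> new ConcurrentHashMap<>());
    }

    /** Explicit reset, for the scenarios where old numbers should be discarded. */
    static void remove(String name) {
        REGISTRIES.remove(name);
    }
}

/** Hypothetical core: keeps no metric state of its own, only a registry reference. */
final class CoreSketch {
    private final ConcurrentMap<String, LongAdder> metrics;

    CoreSketch(String coreName) {
        // Assumed naming scheme; a reloaded core with the same name gets the same registry.
        this.metrics = NamedRegistries.getOrCreate("solr.core." + coreName);
    }

    void handleRequest() {
        metrics.computeIfAbsent("requests", k -> new LongAdder()).increment();
    }

    long requestCount() {
        LongAdder adder = metrics.get("requests");
        return adder == null ? 0L : adder.sum();
    }
}
```

With this shape, "reload" is just constructing a new {{CoreSketch}} with the 
same name, and whether a given reload cause should reset the metrics becomes 
an explicit {{remove(...)}} decision rather than a side effect of the core 
lifecycle.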

Also, from the point of view of monitoring the overall "load" of a particular 
node, it would make sense to track some really low-level Lucene stuff, such 
as major merges and read/write IO, but this can come later - let's first get 
the design right.

> Improve Solr metrics reporting
> ------------------------------
>
>                 Key: SOLR-4735
>                 URL: https://issues.apache.org/jira/browse/SOLR-4735
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>            Priority: Minor
>         Attachments: SOLR-4735.patch, SOLR-4735.patch, SOLR-4735.patch, 
> SOLR-4735.patch
>
>
> Following on from a discussion on the mailing list:
> http://search-lucene.com/m/IO0EI1qdyJF1/codahale&subj=Solr+metrics+in+Codahale+metrics+and+Graphite+
> It would be good to make Solr play more nicely with existing devops 
> monitoring systems, such as Graphite or Ganglia.  Stats monitoring at the 
> moment is poll-only, either via JMX or through the admin stats page.  I'd 
> like to refactor things a bit to make this more pluggable.
> This patch is a start.  It adds a new interface, InstrumentedBean, which 
> extends SolrInfoMBean to return a 
> [[Metrics|http://metrics.codahale.com/manual/core/]] MetricRegistry, and a 
> couple of MetricReporters (which basically just duplicate the JMX and admin 
> page reporting that's there at the moment, but which should be more 
> extensible).  The patch includes a change to RequestHandlerBase showing how 
> this could work.  The idea would be to eventually replace the getStatistics() 
> call on SolrInfoMBean with this instead.
> The next step would be to allow more MetricReporters to be defined in 
> solrconfig.xml.  The Metrics library comes with ganglia and graphite 
> reporting modules, and we can add contrib plugins for both of those.
> There's some more general cleanup that could be done around SolrInfoMBean 
> (we've got two plugin handlers at /mbeans and /plugins that basically do the 
> same thing, and the beans themselves have some weirdly inconsistent data on 
> them - getVersion() returns different things for different impls, and 
> getSource() seems pretty useless), but maybe that's for another issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
