[ 
https://issues.apache.org/jira/browse/SOLR-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15737567#comment-15737567
 ] 

Andrzej Bialecki  commented on SOLR-4735:
-----------------------------------------

One important issue is still unsolved in the current patch: collecting 
per-collection metrics at the same time as per-core metrics. A single Solr 
instance can hold multiple shards and/or replicas that belong to the same 
logical collection, and it would be useful to get combined metrics at the 
collection level.

If both levels weren't needed at the same time, we could probably set up 
per-core aliases (using {{overridableRegistryName}} or a similar mechanism) so 
that a single per-collection registry is used for all shards/replicas. 
However, since all shards would then modify the same underlying {{Metric}} 
objects, it would no longer be possible to separate per-shard metrics from 
these aggregates. Still, this is the simple fallback solution to the problem: 
allow either per-core or per-collection metrics, never both.
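
To make the drawback of the alias approach concrete, here is a minimal, 
self-contained sketch. Everything in it is an assumption for illustration: the 
{{Counter}} class is a small stand-in for the Codahale one, and the 
{{ALIASES}} map and the {{counter(...)}} helper are hypothetical, not the 
actual {{overridableRegistryName}} mechanism.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class AliasedRegistryDemo {
  // Stand-in for com.codahale.metrics.Counter so the sketch runs without the library.
  public static class Counter {
    private final LongAdder count = new LongAdder();
    public void inc(long n) { count.add(n); }
    public long getCount() { return count.sum(); }
  }

  // Hypothetical: registry name -> (metric name -> counter).
  static final Map<String, Map<String, Counter>> REGISTRIES = new ConcurrentHashMap<>();
  // Hypothetical: per-core registry name -> per-collection registry name.
  static final Map<String, String> ALIASES = new ConcurrentHashMap<>();

  // Resolves the alias first, so aliased cores all end up in one registry.
  public static Counter counter(String registry, String metric) {
    String resolved = ALIASES.getOrDefault(registry, registry);
    return REGISTRIES
        .computeIfAbsent(resolved, r -> new ConcurrentHashMap<>())
        .computeIfAbsent(metric, m -> new Counter());
  }

  public static void main(String[] args) {
    ALIASES.put("solr.core.collection1_shard1", "solr.core.collection1");
    ALIASES.put("solr.core.collection1_shard2", "solr.core.collection1");

    Counter shard1 = counter("solr.core.collection1_shard1", "requests");
    Counter shard2 = counter("solr.core.collection1_shard2", "requests");
    shard1.inc(3);
    shard2.inc(4);

    // Both cores hold the very same object, so only the total is available.
    System.out.println(shard1 == shard2);      // prints "true"
    System.out.println(shard1.getCount());     // prints "7"
  }
}
```

Once the two cores share one object, there is nothing left from which the 
individual contributions (3 and 4) could be recovered.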

The Codahale API doesn't support aggregation of child metrics; if it did, we 
could fake the aggregated metrics on the fly when they are needed.

So far the only solution I can think of that keeps both levels of reporting is 
to extend the {{Metric}} implementations so that they also delegate their 
updates to the parent instance(s), something like the following:
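{code}
import com.codahale.metrics.Counter;

// Counter that forwards every update to its parent (aggregate) counters.
public class ChildCounter extends Counter {
  private final Counter[] parents;

  public ChildCounter(Counter... parents) {
    this.parents = parents.clone();
  }

  // dec(long n) would need the same kind of override; omitted here.
  @Override
  public void inc(long n) {
    super.inc(n);
    for (Counter c : parents) {
      c.inc(n);
    }
  }
}
{code}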
I.e., all updates to the child instances would be applied at (nearly) the same 
time to the parent instances, and each parent instance would be referenced by 
several child instances from different shards. For example, a {{ChildCounter}} 
instance would be registered in the "solr.core.collection1_shard1" registry, 
the aggregate counter would be registered in the "solr.core.collection1" 
registry, and the same aggregate counter would be used by the {{ChildCounter}} 
from "solr.core.collection1_shard2".

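The layout described above can be sketched as a small, self-contained example. 
The {{Counter}} class here is only a stand-in for the Codahale one (an 
assumption, so that the example runs without the library), and the registry 
names appear only in comments; no real registry is involved.

```java
import java.util.concurrent.atomic.LongAdder;

public class SharedParentDemo {
  // Stand-in for com.codahale.metrics.Counter so the sketch runs without the library.
  public static class Counter {
    private final LongAdder count = new LongAdder();
    public void inc(long n) { count.add(n); }
    public long getCount() { return count.sum(); }
  }

  // Same idea as the ChildCounter snippet: update self, then every parent.
  public static class ChildCounter extends Counter {
    private final Counter[] parents;

    public ChildCounter(Counter... parents) {
      this.parents = parents.clone();
    }

    @Override
    public void inc(long n) {
      super.inc(n);
      for (Counter p : parents) {
        p.inc(n);
      }
    }
  }

  public static void main(String[] args) {
    Counter aggregate = new Counter();             // "solr.core.collection1"
    Counter shard1 = new ChildCounter(aggregate);  // "solr.core.collection1_shard1"
    Counter shard2 = new ChildCounter(aggregate);  // "solr.core.collection1_shard2"

    shard1.inc(3);
    shard2.inc(4);

    // Per-shard values stay separate while the aggregate sees both updates.
    System.out.println(shard1.getCount());     // prints "3"
    System.out.println(shard2.getCount());     // prints "4"
    System.out.println(aggregate.getCount());  // prints "7"
  }
}
```

Note that the child and parent updates are separate operations, so a reader 
may briefly observe a child value that the aggregate does not yet include; 
this is the "(nearly) the same time" caveat.
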
In order to maintain this delegation, Solr components would always have to use 
the {{SolrMetricRegistry.counter(...), .timer(...), .meter(...), .histogram(...)}} 
methods, which would set up the delegation by obtaining metric instances from 
the parent registries and returning e.g. {{ChildCounter}} instead of a regular 
{{Counter}}, {{ChildTimer}} instead of a regular {{Timer}}, etc.
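
A rough sketch of what such a factory method might look like, for counters 
only. The {{Registry}} class below is hypothetical and is not the API in the 
patch; {{Counter}} is again a stand-in for the Codahale class so that the 
example is self-contained.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class DelegatingRegistryDemo {
  // Stand-in for com.codahale.metrics.Counter so the sketch runs without the library.
  public static class Counter {
    private final LongAdder count = new LongAdder();
    public void inc(long n) { count.add(n); }
    public long getCount() { return count.sum(); }
  }

  public static class ChildCounter extends Counter {
    private final Counter[] parents;
    public ChildCounter(Counter... parents) { this.parents = parents.clone(); }
    @Override
    public void inc(long n) {
      super.inc(n);
      for (Counter p : parents) p.inc(n);
    }
  }

  // Hypothetical registry that knows its parent registries.
  public static class Registry {
    private final Map<String, Counter> counters = new ConcurrentHashMap<>();
    private final Registry[] parents;

    public Registry(Registry... parents) { this.parents = parents.clone(); }

    // Returns the existing counter, or creates one wired to the parents' counters.
    public Counter counter(String name) {
      return counters.computeIfAbsent(name, n -> {
        if (parents.length == 0) {
          return new Counter();                 // top level: plain counter
        }
        Counter[] parentCounters = new Counter[parents.length];
        for (int i = 0; i < parents.length; i++) {
          parentCounters[i] = parents[i].counter(n);
        }
        return new ChildCounter(parentCounters);
      });
    }
  }

  public static void main(String[] args) {
    Registry collection = new Registry();        // "solr.core.collection1"
    Registry shard1 = new Registry(collection);  // "solr.core.collection1_shard1"
    Registry shard2 = new Registry(collection);  // "solr.core.collection1_shard2"

    shard1.counter("requests").inc(2);
    shard2.counter("requests").inc(5);

    System.out.println(shard1.counter("requests").getCount());      // prints "2"
    System.out.println(collection.counter("requests").getCount());  // prints "7"
  }
}
```

The delegation only holds if components obtain their metrics through 
{{counter(...)}}; a {{Counter}} created directly and registered by hand would 
bypass the parents, which is why the factory methods would have to be the only 
entry point.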

This should work and it meets the criteria, but it feels clunky and 
complicated. Any other suggestions?

> Improve Solr metrics reporting
> ------------------------------
>
>                 Key: SOLR-4735
>                 URL: https://issues.apache.org/jira/browse/SOLR-4735
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Alan Woodward
>            Assignee: Andrzej Bialecki 
>            Priority: Minor
>         Attachments: SOLR-4735.patch, SOLR-4735.patch, SOLR-4735.patch, 
> SOLR-4735.patch, screenshot-1.png
>
>
> Following on from a discussion on the mailing list:
> http://search-lucene.com/m/IO0EI1qdyJF1/codahale&subj=Solr+metrics+in+Codahale+metrics+and+Graphite+
> It would be good to make Solr play more nicely with existing devops 
> monitoring systems, such as Graphite or Ganglia.  Stats monitoring at the 
> moment is poll-only, either via JMX or through the admin stats page.  I'd 
> like to refactor things a bit to make this more pluggable.
> This patch is a start.  It adds a new interface, InstrumentedBean, which 
> extends SolrInfoMBean to return a 
> [[Metrics|http://metrics.codahale.com/manual/core/]] MetricRegistry, and a 
> couple of MetricReporters (which basically just duplicate the JMX and admin 
> page reporting that's there at the moment, but which should be more 
> extensible).  The patch includes a change to RequestHandlerBase showing how 
> this could work.  The idea would be to eventually replace the getStatistics() 
> call on SolrInfoMBean with this instead.
> The next step would be to allow more MetricReporters to be defined in 
> solrconfig.xml.  The Metrics library comes with ganglia and graphite 
> reporting modules, and we can add contrib plugins for both of those.
> There's some more general cleanup that could be done around SolrInfoMBean 
> (we've got two plugin handlers at /mbeans and /plugins that basically do the 
> same thing, and the beans themselves have some weirdly inconsistent data on 
> them - getVersion() returns different things for different impls, and 
> getSource() seems pretty useless), but maybe that's for another issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
