Re: Collection API for performance monitoring?

Tomás Fernández Löbbe Tue, 15 Nov 2016 09:54:53 -0800

If you only need query/update performance, you could aggregate the logs too.
If you need more information, I like what was proposed in SOLR-9641, which
would allow you to collect and aggregate metrics for internal components
too.


Tomás

On Tue, Nov 15, 2016 at 8:31 AM, Walter Underwood <wun...@wunderwood.org>
wrote:

> To calculate percentiles we need all the data points. If there is a lot of
> data, it could be sampled.
>
> Average can be calculated with the total time and the number of requests.
> Snapshots of those two values allow snapshots of averages.
>
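The snapshot arithmetic described above can be sketched as follows. This is only an illustration: the `Snapshot` type and its field names are hypothetical and do not correspond to any actual Solr stats field. It assumes the two values polled are cumulative counters.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    """Cumulative counters read at one polling instant (hypothetical shape)."""
    requests: int          # total requests handled since the counters started
    total_time_ms: float   # total time spent on those requests, in milliseconds


def window_average_ms(earlier: Snapshot, later: Snapshot) -> Optional[float]:
    """Average response time for the requests that arrived between two snapshots.

    Returns None when the window holds no requests or when the counters went
    backwards (for example after a reload reset them).
    """
    delta_requests = later.requests - earlier.requests
    delta_time_ms = later.total_time_ms - earlier.total_time_ms
    if delta_requests <= 0 or delta_time_ms < 0:
        return None
    return delta_time_ms / delta_requests


if __name__ == "__main__":
    before = Snapshot(requests=1000, total_time_ms=50_000.0)
    after = Snapshot(requests=1600, total_time_ms=92_000.0)
    print(window_average_ms(before, after))
```

The smallest window this can describe is the polling interval itself, since nothing is known about what happened between two reads.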
> But averages are the wrong metric for a one-sided distribution like
> response time. Let’s assume that any response longer than 10 seconds is a
> bad experience. Percentiles will tell you what response time 95% of
> customer searches are getting. With averages, a single 30 second response
> time will increase the metric, even though it is “just as broken” as a
> 15 s response.
>
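A small sketch of the point about averages versus percentiles, using made-up response times (19 fast responses plus one slow one). The nearest-rank percentile used here is one common definition, chosen for simplicity; it is not necessarily what any particular monitoring tool computes.

```python
import math
from statistics import mean


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Multiply before dividing so whole-number percentiles stay exact.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Invented data: response times in seconds, identical except for the one outlier.
with_15s_outlier = [0.2] * 19 + [15.0]
with_30s_outlier = [0.2] * 19 + [30.0]

if __name__ == "__main__":
    for label, samples in (("15 s", with_15s_outlier), ("30 s", with_30s_outlier)):
        print(label,
              "mean:", round(mean(samples), 2),
              "p95:", percentile_nearest_rank(samples, 95))
```

With this data the mean moves from about 0.94 s to about 1.69 s when the outlier grows from 15 s to 30 s, while the 95th percentile stays at 0.2 s in both cases.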
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
> On Nov 15, 2016, at 7:27 AM, Ryan Josal <rjo...@gmail.com> wrote:
>
> I haven't tried for 95th percentile, but generally with those collection
> start stats you would monitor based on calculated deltas.  You can figure
> out the average response time for any given window of time not smaller than
> your snapshot polling interval.  I don't see why 95th percentile would be
> any different.
>
> Ryan
>
> On Monday, November 14, 2016, Walter Underwood <wun...@wunderwood.org>
> wrote:
>
>> Because the current stats are not usable. They really should be removed
>> from the code.
>>
>> They calculate percentiles since the last collection load. We need to
>> know the 95th percentile during the peak hour last night, not the 95th
>> for the last month.
>>
>> Right now, we run eleven collections in our Solr 4 cluster. In each
>> collection, we have several different handlers. Usually, one for
>> autosuggest (instant results), one for the SRP, and one for mobile,
>> though we also have SEO requests and so on. We can track performance
>> for each of these.
>>
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>>
>>
>> On Nov 14, 2016, at 3:54 PM, Erick Erickson <erickerick...@gmail.com>
>> wrote:
>>
>> Point taken, and thanks for the link. The stats I'm referring to in
>> this thread are available now, and would (I think) be a quick win. I
>> don't have a huge amount of investment in it though, more "why didn't
>> we think of this before?" followed by "maybe there's a very good
>> reason not to bother". This may be it since we now standardize on
>> Jetty. My question of course is whether this would be supported moving
>> forward to netty or whatever...
>>
>> Best,
>> Erick
>>
>> On Mon, Nov 14, 2016 at 3:44 PM, Walter Underwood <wun...@wunderwood.org>
>> wrote:
>>
>> I’m not fond of polling for performance stats. I’d rather have the app
>> report them.
>>
>> We could integrate existing Jetty monitoring:
>>
>> http://metrics.dropwizard.io/3.1.0/manual/jetty/
>>
>> From our experience with a similar approach, we might need some
>> Solr-specific metric conflation. SolrJ sends a request to
>> /solr/collection/handler as /solr/collection/select?qt=/handler.
>> In our code, we fix that request to the intended path. We’ve been running
>> a Tomcat metrics search filter for three years.
>>
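The path fix-up mentioned above might look roughly like this. It is a guess at the idea only, not the filter described in the message: the function name is hypothetical, and it assumes the metric key is simply the request path with a `qt` parameter folded back in.

```python
from urllib.parse import parse_qs, urlsplit


def intended_handler_path(request_uri: str) -> str:
    """Map /solr/<collection>/select?qt=/handler to /solr/<collection>/handler.

    Any request that is not a /select call carrying a path-style qt parameter
    is returned as its plain path, with the query string dropped.
    """
    parts = urlsplit(request_uri)
    qt_values = parse_qs(parts.query).get("qt", [])
    is_select = parts.path.endswith("/select")
    if is_select and qt_values and qt_values[0].startswith("/"):
        collection_prefix = parts.path[: -len("/select")]
        return collection_prefix + qt_values[0]
    return parts.path


if __name__ == "__main__":
    print(intended_handler_path("/solr/collection/select?qt=/handler&q=foo"))
    print(intended_handler_path("/solr/collection/select?q=foo"))
```

Keying metrics on the rewritten path keeps requests for the same handler in one bucket regardless of how the client addressed it.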
>> Also, see:
>>
>> https://issues.apache.org/jira/browse/SOLR-8785
>>
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>>
>>
>> On Nov 14, 2016, at 3:25 PM, Erick Erickson <erickerick...@gmail.com>
>> wrote:
>>
>> What do people think about exposing a Collections API call (name TBD,
>> but the sense is PERFORMANCESTATS) that would simply issue the
>> admin/mbeans call to each replica of a collection and report the
>> results back? This would give operations monitors the ability to see,
>> say, anomalous replicas that had poor average response times for the
>> last 5 minutes and the like.
>>
>> Seems like an easy enhancement that would make ops people's lives easier.
>>
>> I'll raise a JIRA if there's interest, but sure won't make progress on
>> it until I clear my plate of some other JIRAs that I've let linger for
>> far too long.
>>
>> Erick
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>
