[jira] [Comment Edited] (SOLR-6275) Improve accuracy of QTime reporting

Ramkumar Aiyengar (JIRA) Sun, 27 Jul 2014 21:30:07 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075884#comment-14075884
 ]


Ramkumar Aiyengar edited comment on SOLR-6275 at 7/28/14 4:27 AM:
------------------------------------------------------------------

bq. Hmmm, I'm not sure I agree. QTime is useful for knowing how long the lower 
level stuff took.
Thus it's useful as it stands.

Can you elaborate what you mean by "lower level stuff"? If you are intending 
QTime to reflect time taken by Lucene, i.e. roughly the time taken by the 
top-level collector to collect the results needed, QTime covers way more than 
that already, including time taken for query parsing and finishing up just in 
the non-distributed case. In the distributed case, it also covers time taken 
waiting on the shard handler factory pool, network latency, any servlet 
container pooling time at the processing shards, time taken waiting for the 
federating node to take all responses, and time taken to merge all responses.

QTime as it stands is currently defined from the time the {{SolrQueryRequest}} 
object is created till the response is rendered, which is hard to associate any 
semantic meaning to, and it roughly is all steps except the logic required for 
resolving the core to send the request to. All I am doing is adding that as 
well, to logically mean (again roughly, but less roughly) "time taken by the 
servlet to service the request".

bq. I mean we're talking human-time here. When I'm asking how long a query 
took, my frail human system is not capable of noticing a difference of a 
millisecond or two. nanoTime is not necessarily all that accurate either. Sure, 
it gives a lot or precision, but that's different from accuracy.

The problem is not because of millisecond vs. nanosecond accuracy. 
{{currentTimeMillis}} represents the wall clock of the system and is subject to 
issues like clock skew. For example, if NTP resets the time or if the sys admin 
changes the time for some reason or if some other action changes the wall time, 
the difference between two such measurements can be totally incorrect 
(including being negative). Which is why {{nanoTime}} is preferred for timing 
measurements as it's guaranteed to be monotonic (where the OS supports it, i.e. 
everywhere except some older versions of Windows). See SOLR-5734 for some 
context where we changed all references of this thing, QTime was left out.

bq. Frankly, this doesn't seem worth the change to me. And especially if we 
return the full nanoTime string which would render any current tools for 
reporting performance metrics invalid by a factor 100,000, all for extremely 
questionable utility.

We are still reporting as milliseconds, no change in resolution here. The 
{{RTimer}} utility class used for this purpose already does the conversion.



was (Author: andyetitmoves):
bq. Hmmm, I'm not sure I agree. QTime is useful for knowing how long the lower 
level stuff took.
Thus it's useful as it stands.

Can you elaborate what you mean by "lower level stuff"? If you are intending 
QTime to reflect time taken by Lucene, i.e. roughly the time taken by the 
top-level collector to collect the results needed, QTime covers way more than 
that already, including time taken for query parsing and finishing up just in 
the non-distributed case. In the distributed case, it also covers time taken 
waiting on the shard handler factory pool, network latency, any servlet 
container pooling time at the processing shards, time taken waiting for the 
federating node to take all responses, and time taken to merge all responses.

QTime as it stands is currently defined from the time the {{SolrQueryRequest}} 
object is created till the response is rendered, which is hard to associate any 
semantic meaning to, and it roughly is all steps except the logic required for 
resolving the core to send the request to. All I am doing is adding that as 
well, to logically mean (again roughly, but less roughly) "time taken by the 
servlet to service the request".

bq. I mean we're talking human-time here. When I'm asking how long a query took,
my frail human system is not capable of noticing a difference of a millisecond 
or two.
nanoTime is not necessarily all that accurate either. Sure, it gives a lot or 
precision,
but that's different from accuracy.

The problem is not because of millisecond vs. nanosecond accuracy. 
{{currentTimeMillis}} represents the wall clock of the system and is subject to 
issues like clock skew. For example, if NTP resets the time or if the sys admin 
changes the time for some reason or if some other action changes the wall time, 
the difference between two such measurements can be totally incorrect 
(including being negative). Which is why {{nanoTime}} is preferred for timing 
measurements as it's guaranteed to be monotonic (where the OS supports it, i.e. 
everywhere except some older versions of Windows). See SOLR-5734 for some 
context where we changed all references of this thing, QTime was left out.

bq. Frankly, this doesn't seem worth the change to me. And especially if we 
return the full
nanoTime string which would render any current tools for reporting performance
metrics invalid by a factor 100,000, all for extremely questionable utility.

We are still reporting as milliseconds, no change in resolution here. The 
{{RTimer}} utility class used for this purpose already does the conversion.


> Improve accuracy of QTime reporting
> -----------------------------------
>
>                 Key: SOLR-6275
>                 URL: https://issues.apache.org/jira/browse/SOLR-6275
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>            Reporter: Ramkumar Aiyengar
>            Priority: Minor
>
> Currently, {{QTime}} uses {{currentTimeMillis}} instead of {{nanoTime}} and 
> hence is not suitable for time measurements. Further, it is really started 
> after all the dispatch logic in {{SolrDispatchFilter}} (same with the top 
> level timing reported by {{debug=timing}}) which may or may not be expensive, 
> and hence may not fully represent the time taken by the search. This is to 
> remedy both cases.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Comment Edited] (SOLR-6275) Improve accuracy of QTime reporting

Reply via email to