[jira] [Updated] (LUCENE-4574) FunctionQuery ValueSource value computed twice per document

David Smiley (JIRA) Fri, 30 Nov 2012 00:22:07 -0800

     [ 
https://issues.apache.org/jira/browse/LUCENE-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


David Smiley updated LUCENE-4574:
---------------------------------

    Attachment: LUCENE-4574.patch

bq. one of the comparators is relevance and also its asked to track scores/max 
scores

Here is a new patch that adds such a flag; I had to rejigger my logic somewhat. 
 There is no wrapping now for OneComparatorNonScoringCollector.setScorer().

This time I also removed RelevanceComparator.setScorer()'s attempt to wrap 
the scorer if it wasn't already wrapped.  After all, we're trying to wrap 
only when we need to, and the collector is now in charge of that.  I added 
assertions that detect when this comparator is about to get the score for the 
same doc as it last got and the scorer isn't already a caching scorer.  Guess 
what?  Those assertions failed.

TestShardSearching.testSimple() failed this assertion, and it uses 
OneComparatorNonScoringCollector with the RelevanceComparator.
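
A minimal sketch of the kind of assertion described above, using simplified 
stand-ins for the Lucene types.  SimpleScorer, CachingScorer, FixedScorer and 
RelevanceCheck are hypothetical names for illustration only; they are not the 
classes touched by the patch.  The check fires when the same doc is scored 
twice in a row through a scorer that does not cache.  Real code would more 
likely use the assert keyword; an explicit throw is used here so the check 
runs without -ea.

```java
// Hypothetical stand-in for Lucene's Scorer; not the real API.
interface SimpleScorer {
    int docID();
    float score();
}

// Hypothetical stand-in for a score-caching wrapper: remembers the score of the current doc.
final class CachingScorer implements SimpleScorer {
    private final SimpleScorer in;
    private int cachedDoc = -1;   // -1 means nothing cached yet
    private float cachedScore;

    CachingScorer(SimpleScorer in) { this.in = in; }

    public int docID() { return in.docID(); }

    public float score() {
        int doc = in.docID();
        if (doc != cachedDoc) {   // compute only once per doc
            cachedScore = in.score();
            cachedDoc = doc;
        }
        return cachedScore;
    }
}

// Test helper: a scorer parked on one doc that counts how often score() is computed.
final class FixedScorer implements SimpleScorer {
    final int doc;
    final float value;
    int calls = 0;

    FixedScorer(int doc, float value) { this.doc = doc; this.value = value; }

    public int docID() { return doc; }
    public float score() { calls++; return value; }
}

// Sketch of a comparator that no longer wraps its scorer and only checks for repeated scoring.
final class RelevanceCheck {
    private SimpleScorer scorer;
    private int lastScoredDoc = -1;

    void setScorer(SimpleScorer scorer) {
        this.scorer = scorer;     // no wrapping here; the collector decides
        this.lastScoredDoc = -1;
    }

    float scoreCurrentDoc() {
        int doc = scorer.docID();
        if (doc == lastScoredDoc && !(scorer instanceof CachingScorer)) {
            throw new AssertionError("doc " + doc + " scored twice without a caching scorer");
        }
        lastScoredDoc = doc;
        return scorer.score();
    }
}
```

With a raw FixedScorer, the second scoreCurrentDoc() call on the same doc 
throws; wrapping it in CachingScorer first makes the repeated call legal and 
served from the cache.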
                
> FunctionQuery ValueSource value computed twice per document
> -----------------------------------------------------------
>
>                 Key: LUCENE-4574
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4574
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 4.0, 4.1
>            Reporter: David Smiley
>            Assignee: David Smiley
>         Attachments: LUCENE-4574.patch, LUCENE-4574.patch, LUCENE-4574.patch, 
> Test_for_LUCENE-4574.patch
>
>
> I was working on a custom ValueSource and did some basic profiling and 
> debugging to see if it was being used optimally.  To my surprise, the value 
> was being fetched twice in a row for each document.  This computation isn't 
> cheap, so this is a big problem.  I was able to work around the problem 
> trivially on my end by caching the last value with its corresponding docid 
> in my FunctionValues implementation.
> Here is an excerpt of the code path to the first execution:
> {noformat}
>         at org.apache.lucene.queries.function.docvalues.DoubleDocValues.floatVal(DoubleDocValues.java:48)
>         at org.apache.lucene.queries.function.FunctionQuery$AllScorer.score(FunctionQuery.java:153)
>         at org.apache.lucene.search.TopFieldCollector$OneComparatorScoringMaxScoreCollector.collect(TopFieldCollector.java:291)
>         at org.apache.lucene.search.Scorer.score(Scorer.java:62)
>         at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:588)
>         at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:280)
> {noformat}
> And here is the 2nd call:
> {noformat}
>         at org.apache.lucene.queries.function.docvalues.DoubleDocValues.floatVal(DoubleDocValues.java:48)
>         at org.apache.lucene.queries.function.FunctionQuery$AllScorer.score(FunctionQuery.java:153)
>         at org.apache.lucene.search.ScoreCachingWrappingScorer.score(ScoreCachingWrappingScorer.java:56)
>         at org.apache.lucene.search.FieldComparator$RelevanceComparator.copy(FieldComparator.java:951)
>         at org.apache.lucene.search.TopFieldCollector$OneComparatorScoringMaxScoreCollector.collect(TopFieldCollector.java:312)
>         at org.apache.lucene.search.Scorer.score(Scorer.java:62)
>         at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:588)
>         at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:280)
> {noformat}
> The 2nd call appears to use some score caching mechanism, which is all well 
> and good, but that same mechanism wasn't used in the first call so there's no 
> cached value to retrieve.
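
The workaround mentioned in the description, caching the last value together 
with its docid, might look roughly like the sketch below.  LastValueCache is 
a hypothetical stand-in for a custom FunctionValues implementation, and the 
IntToDoubleFunction it wraps represents whatever expensive per-document 
computation the ValueSource performs; neither is taken from the actual code.  
The sketch assumes single-threaded use of each instance.

```java
import java.util.function.IntToDoubleFunction;

// Hypothetical stand-in for a custom FunctionValues implementation; not the Lucene class.
final class LastValueCache {
    private final IntToDoubleFunction expensive; // the costly per-document computation
    private int lastDoc = -1;                    // -1 means nothing cached yet
    private double lastValue;

    LastValueCache(IntToDoubleFunction expensive) {
        this.expensive = expensive;
    }

    // Recomputes only when asked about a different doc than the previous call.
    double doubleVal(int doc) {
        if (doc != lastDoc) {
            lastValue = expensive.applyAsDouble(doc);
            lastDoc = doc;
        }
        return lastValue;
    }

    // Narrows the (possibly cached) double, as the floatVal call in the traces does.
    float floatVal(int doc) {
        return (float) doubleVal(doc);
    }
}
```

Two consecutive floatVal(doc) calls for the same doc then trigger a single 
computation, which hides the double fetch but does not address its cause in 
the collector.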
