[ 
https://issues.apache.org/jira/browse/LUCENE-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated LUCENE-4574:
---------------------------------

    Attachment: LUCENE-4574.patch

I did some more debugging.  It appears the problem will only occur when 
RelevanceComparator is involved, as it is the only FieldComparator that 
overrides setScorer().  It wraps the scorer with the caching scorer, however 
ideally the passed-in scorer should already be wrapped as such (that is not the 
case).  Even if this comparator is the only one in Lucene that overrides this 
method, there's nothing stopping an app with a custom FieldComparator to do the 
same.  I think the right place to inject the cache is TopFieldCollector's 
subclasses of which there are two, each of which each override setScorer() to 
store a copy of the scorer onto a field.  Each should now look like:

{code:java}
    @Override
    public void setScorer(Scorer scorer) throws IOException {
      scorer = ScoreCachingWrappingScorer.wrap(scorer);
      this.scorer = scorer;
      comparator.setScorer(scorer);
    }
{code}
(the .wrap() method is a convenience I added)

Sound good?  See the attached patch.
                
> FunctionQuery ValueSource value computed twice per document
> -----------------------------------------------------------
>
>                 Key: LUCENE-4574
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4574
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 4.0, 4.1
>            Reporter: David Smiley
>         Attachments: LUCENE-4574.patch, Test_for_LUCENE-4574.patch
>
>
> I was working on a custom ValueSource and did some basic profiling and 
> debugging to see if it was being used optimally.  To my surprise, the value 
> was being fetched twice per document in a row.  This computation isn't 
> exactly cheap to calculate so this is a big problem.  I was able to 
> work-around this problem trivially on my end by caching the last value with 
> corresponding docid in my FunctionValues implementation.
> Here is an excerpt of the code path to the first execution:
> {noformat}
>         at 
> org.apache.lucene.queries.function.docvalues.DoubleDocValues.floatVal(DoubleDocValues.java:48)
>         at 
> org.apache.lucene.queries.function.FunctionQuery$AllScorer.score(FunctionQuery.java:153)
>         at 
> org.apache.lucene.search.TopFieldCollector$OneComparatorScoringMaxScoreCollector.collect(TopFieldCollector.java:291)
>         at org.apache.lucene.search.Scorer.score(Scorer.java:62)
>         at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:588)
>         at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:280)
> {noformat}
> And here is the 2nd call:
> {noformat}
>         at 
> org.apache.lucene.queries.function.docvalues.DoubleDocValues.floatVal(DoubleDocValues.java:48)
>         at 
> org.apache.lucene.queries.function.FunctionQuery$AllScorer.score(FunctionQuery.java:153)
>         at 
> org.apache.lucene.search.ScoreCachingWrappingScorer.score(ScoreCachingWrappingScorer.java:56)
>         at 
> org.apache.lucene.search.FieldComparator$RelevanceComparator.copy(FieldComparator.java:951)
>         at 
> org.apache.lucene.search.TopFieldCollector$OneComparatorScoringMaxScoreCollector.collect(TopFieldCollector.java:312)
>         at org.apache.lucene.search.Scorer.score(Scorer.java:62)
>         at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:588)
>         at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:280)
> {noformat}
> The 2nd call appears to use some score caching mechanism, which is all well 
> and good, but that same mechanism wasn't used in the first call so there's no 
> cached value to retrieve.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to