[
https://issues.apache.org/jira/browse/LUCENE-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505548#comment-13505548
]
Robert Muir commented on LUCENE-4574:
-------------------------------------
Right, there is more "fixing" needed for the other collectors and other
situations.
But I think Solr should still be fixed for the common sort-by-score case.
I don't like the duplicate calls to score. I feel like the API should not
support this. But I don't think caching is the correct solution.
It already frustrates me that there are caches everywhere, for example
BooleanScorer2 has a super-secret score cache just like this.
I have plans to hunt down and kill all such little caches in Lucene. It's not
the right solution.
The question for this one is:
If the user adds relevance as a sort but then also asks to track doc scores/max
scores, how should the collector work?
I definitely don't like the idea of more specialized collectors: god knows
there are already too many, but maybe we can avoid this.
Also: can we speed up this particular query? Why is its score so costly?
> FunctionQuery ValueSource value computed twice per document
> -----------------------------------------------------------
>
> Key: LUCENE-4574
> URL: https://issues.apache.org/jira/browse/LUCENE-4574
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/search
> Affects Versions: 4.0, 4.1
> Reporter: David Smiley
> Attachments: LUCENE-4574.patch, Test_for_LUCENE-4574.patch
>
>
> I was working on a custom ValueSource and did some basic profiling and
> debugging to see if it was being used optimally. To my surprise, the value
> was being fetched twice in a row for each document. This computation isn't
> exactly cheap, so this is a big problem. I was able to work around it
> trivially on my end by caching the last value with its corresponding docid
> in my FunctionValues implementation.
> Here is an excerpt of the code path to the first execution:
> {noformat}
> at org.apache.lucene.queries.function.docvalues.DoubleDocValues.floatVal(DoubleDocValues.java:48)
> at org.apache.lucene.queries.function.FunctionQuery$AllScorer.score(FunctionQuery.java:153)
> at org.apache.lucene.search.TopFieldCollector$OneComparatorScoringMaxScoreCollector.collect(TopFieldCollector.java:291)
> at org.apache.lucene.search.Scorer.score(Scorer.java:62)
> at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:588)
> at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:280)
> {noformat}
> And here is the 2nd call:
> {noformat}
> at org.apache.lucene.queries.function.docvalues.DoubleDocValues.floatVal(DoubleDocValues.java:48)
> at org.apache.lucene.queries.function.FunctionQuery$AllScorer.score(FunctionQuery.java:153)
> at org.apache.lucene.search.ScoreCachingWrappingScorer.score(ScoreCachingWrappingScorer.java:56)
> at org.apache.lucene.search.FieldComparator$RelevanceComparator.copy(FieldComparator.java:951)
> at org.apache.lucene.search.TopFieldCollector$OneComparatorScoringMaxScoreCollector.collect(TopFieldCollector.java:312)
> at org.apache.lucene.search.Scorer.score(Scorer.java:62)
> at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:588)
> at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:280)
> {noformat}
> The 2nd call appears to go through a score caching mechanism
> (ScoreCachingWrappingScorer), which is all well and good, but the first call
> did not go through that same mechanism, so there's no cached value to retrieve.
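
The double computation described above can be reproduced in a small, self-contained sketch. Everything here is a hypothetical stand-in rather than Lucene code: `CountingScorer` plays the role of a scorer with a costly `score()`, and `CachingScorer` assumes a cache that remembers only the score of the most recently scored doc, in the spirit of the wrapper seen in the second trace. The point is only that a cache cannot help when the first call bypasses it.

```java
/** Illustrative sketch only; these classes are hypothetical stand-ins, not Lucene code. */
final class ScoreCacheSketch {

    /** Stand-in for a scorer with a costly score(); counts how often it computes. */
    static final class CountingScorer {
        int computations = 0;

        float score(int doc) {
            computations++;
            return doc * 0.5f; // placeholder for an expensive ValueSource lookup
        }
    }

    /** Remembers the score of the most recently scored doc (assumed cache behaviour). */
    static final class CachingScorer {
        private final CountingScorer in;
        private int cachedDoc = -1; // -1 means nothing is cached yet
        private float cachedScore;

        CachingScorer(CountingScorer in) {
            this.in = in;
        }

        float score(int doc) {
            if (doc != cachedDoc) {
                cachedScore = in.score(doc); // cache miss: compute and remember
                cachedDoc = doc;
            }
            return cachedScore;
        }
    }

    /** First call hits the raw scorer, second goes through the cache, as in the two traces. */
    static int computationsWhenFirstCallBypassesCache(int doc) {
        CountingScorer raw = new CountingScorer();
        CachingScorer cached = new CachingScorer(raw);
        raw.score(doc);    // collector reads the score directly
        cached.score(doc); // comparator reads it through the cache: a miss
        return raw.computations;
    }

    /** Both calls go through the same cache. */
    static int computationsWhenBothCallsUseCache(int doc) {
        CountingScorer raw = new CountingScorer();
        CachingScorer cached = new CachingScorer(raw);
        cached.score(doc); // miss, computes once
        cached.score(doc); // hit, no recomputation
        return raw.computations;
    }

    public static void main(String[] args) {
        System.out.println(computationsWhenFirstCallBypassesCache(7)); // prints 2
        System.out.println(computationsWhenBothCallsUseCache(7));      // prints 1
    }
}
```

The reporter's work-around (caching the last value with its docid inside the FunctionValues implementation) follows the same last-doc pattern, just one layer lower, which is why it hides the duplicate call regardless of which path the collector takes. Whether caching is the right long-term fix is exactly what the comment at the top of this message disputes.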