[jira] [Commented] (LUCENE-6425) Move extractTerms to Weight

Robert Muir (JIRA) Wed, 15 Apr 2015 14:22:45 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-6425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14497033#comment-14497033
 ]


Robert Muir commented on LUCENE-6425:
-------------------------------------

Then there are bugs here. 

Apparently there are 3 ways to use TermQuery: one avoids reseeking the terms
dict, another lies about docFreq, and the third is normal usage. The first two
are getting mixed up with each other, and we end up with broken statistics...
That should only be the case for FuzzyLikeThis, so something is really wrong.

> Move extractTerms to Weight
> ---------------------------
>
>                 Key: LUCENE-6425
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6425
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-6425.patch, LUCENE-6425.patch
>
>
> Today we have extractTerms on Query, but it is only supposed to be called
> after the query has been specialized to a given IndexReader using
> Query.rewrite(IndexReader), which allows some complex queries to replace term
> "matchers" with actual terms (e.g. WildcardQuery).
> However, we already have an abstraction for IndexReader-specialized queries:
> Weight. So I think it would make more sense to have extractTerms on Weight.
> This would also remove the trap of calling extractTerms on a query that has
> not been rewritten yet.
> Since Weights know whether scores are needed, I also hope this
> would help improve the extractTerms semantics. We currently have 2 use cases
> for extractTerms: distributed IDF and highlighting. While the former only
> cares about terms that are used for scoring, it could make sense to
> highlight terms that were used for matching, even if they did not contribute
> to the score (e.g. if wrapped in a ConstantScoreQuery or a BooleanQuery FILTER
> clause). So highlighters could do searcher.createNormalizedWeight(query,
> false).extractTerms(termSet) to get all terms that were used for matching the
> query, while distributed IDF would instead do
> searcher.createNormalizedWeight(query, true).extractTerms(termSet) to get
> scoring terms only.
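
To make the semantics proposed in the issue description concrete, here is a
minimal toy model in plain Java. It does not use Lucene: the Query, Weight,
TermQuery, ConstantScoreQuery and BooleanQuery types below are hypothetical
stand-ins that only borrow the Lucene names, and the helper extract() is
invented for this illustration. It assumes that a constant-score wrapper or a
FILTER clause contributes terms only when scores are not needed, which is one
reading of the description above and not necessarily what the patch does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class ExtractTermsSketch {

    /** Toy stand-in for a reader-specialized query (not Lucene's Weight). */
    interface Weight {
        void extractTerms(Set<String> terms);
    }

    /** Toy stand-in for a query; needsScores mirrors the flag in the issue. */
    interface Query {
        Weight createWeight(boolean needsScores);
    }

    static final class TermQuery implements Query {
        private final String term;
        TermQuery(String term) { this.term = term; }
        @Override public Weight createWeight(boolean needsScores) {
            // A plain term both matches and scores, so it is always reported.
            return terms -> terms.add(term);
        }
    }

    static final class ConstantScoreQuery implements Query {
        private final Query inner;
        ConstantScoreQuery(Query inner) { this.inner = inner; }
        @Override public Weight createWeight(boolean needsScores) {
            if (needsScores) {
                // Assumption: wrapped terms do not contribute to the score.
                return terms -> { };
            }
            // For matching purposes the wrapped query's terms still count.
            return inner.createWeight(false);
        }
    }

    static final class BooleanQuery implements Query {
        private final List<Query> must = new ArrayList<>();
        private final List<Query> filter = new ArrayList<>();
        BooleanQuery must(Query q) { must.add(q); return this; }
        BooleanQuery filter(Query q) { filter.add(q); return this; }
        @Override public Weight createWeight(boolean needsScores) {
            List<Weight> weights = new ArrayList<>();
            for (Query q : must) {
                weights.add(q.createWeight(needsScores));
            }
            if (!needsScores) {
                // Assumption: FILTER clauses match but never score.
                for (Query q : filter) {
                    weights.add(q.createWeight(false));
                }
            }
            return terms -> {
                for (Weight w : weights) {
                    w.extractTerms(terms);
                }
            };
        }
    }

    /** Hypothetical helper: collect terms in sorted order for display. */
    static Set<String> extract(Query query, boolean needsScores) {
        Set<String> terms = new TreeSet<>();
        query.createWeight(needsScores).extractTerms(terms);
        return terms;
    }

    public static void main(String[] args) {
        Query query = new BooleanQuery()
            .must(new TermQuery("lucene"))
            .filter(new TermQuery("2015"))
            .must(new ConstantScoreQuery(new TermQuery("weight")));
        // Highlighting use case: every term used for matching.
        System.out.println("matching terms: " + extract(query, false));
        // Distributed IDF use case: only terms that affect the score.
        System.out.println("scoring terms:  " + extract(query, true));
    }
}
```

In this toy model the first call reports [2015, lucene, weight] and the second
reports only [lucene], which is the distinction between the highlighting and
distributed IDF use cases drawn in the description.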



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
