[ 
https://issues.apache.org/jira/browse/LUCENE-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314228#comment-14314228
 ] 

Adrien Grand commented on LUCENE-6226:
--------------------------------------

bq.  Maybe I should just switch it back to the bitset again, but change 
FLAG_POSITIONS so that it doesn't require freqs

I'm not sure how much I like the bitset approach because of all the possible 
combinations that it introduces. Even if a comparable enum is less flexible, I 
like the fact that it is easier to reason about and test (N options instead of 
2^N flag combinations).

bq. So if you request positions from a Scorer that doesn't support them at all, 
it can just return NO_MORE_POSITIONS immediately.

This sounds error-prone to me; I would really prefer an exception instead.

Something that concerns me about PostingsFeatures.FREQS is that it is not what 
most queries actually need. For instance, if you want scores on a boolean query, 
you don't actually need freqs on the underlying queries, but you do need their 
scores, so maybe the name should mention scores instead? I was thinking 
that we could start with the following 3 options:
 - DOCS
 - DOCS_AND_FREQS_AND_SCORES
 - DOCS_AND_FREQS_AND_SCORES_AND_POSITIONS

The first option is like today's needsScores=false and the second one is like 
today's needsScores=true.
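
A minimal sketch of the three proposed levels as a comparable enum. The 
constant names come from the list above; the helper methods (includes, 
needsScores, needsPositions, fromNeedsScores) are hypothetical and only 
illustrate how ordered levels could replace a boolean or a bitset. This is not 
an existing Lucene API.

```java
// Hypothetical sketch of the proposed option levels; not an existing Lucene API.
enum PostingsFeatures {
  // Declaration order matters: each level includes everything before it.
  DOCS,
  DOCS_AND_FREQS_AND_SCORES,
  DOCS_AND_FREQS_AND_SCORES_AND_POSITIONS;

  /** True if this level provides at least what {@code other} asks for. */
  boolean includes(PostingsFeatures other) {
    return compareTo(other) >= 0;
  }

  boolean needsScores() {
    return includes(DOCS_AND_FREQS_AND_SCORES);
  }

  boolean needsPositions() {
    return includes(DOCS_AND_FREQS_AND_SCORES_AND_POSITIONS);
  }

  /** Maps the existing needsScores boolean onto the first two levels. */
  static PostingsFeatures fromNeedsScores(boolean needsScores) {
    return needsScores ? DOCS_AND_FREQS_AND_SCORES : DOCS;
  }
}
```

Because the levels are totally ordered, a test only has to cover three cases 
rather than every combination of independent flags.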

The way I see it, both DOCS and DOCS_AND_FREQS_AND_SCORES should never fail. 
The difference between freqs/scores and positions is that it is usually ok to 
assume a default value of 1 when scores or freqs do not really make sense 
(e.g. this is what ConstantScoreQuery and QueryWrapperFilter already do today), 
but there is no such default for positions. So this third option would raise an 
error if positions either do not make sense (e.g. a doc-values based query), 
are crazy to implement (e.g. MatchAll) or are not available (e.g. a TermQuery 
when positions are not indexed).
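
The distinction above could be sketched as follows for a query that has no 
meaningful positions. All class and method names here are hypothetical 
stand-ins, and the choice of UnsupportedOperationException is an assumption; 
the enum is redeclared only so that the snippet compiles on its own.

```java
// Hypothetical sketch; these types do not exist in Lucene under these names.
enum PostingsFeatures {
  DOCS,
  DOCS_AND_FREQS_AND_SCORES,
  DOCS_AND_FREQS_AND_SCORES_AND_POSITIONS
}

/** Scorer for a query where freqs and scores carry no information. */
final class DefaultValueScorer {
  int freq() {
    return 1; // default value when freqs do not make sense
  }

  float score() {
    return 1f; // default value when scores do not make sense
  }
}

/** Stand-in for a doc-values based query. */
final class DocValuesLikeQuery {
  DefaultValueScorer scorer(PostingsFeatures requested) {
    if (requested == PostingsFeatures.DOCS_AND_FREQS_AND_SCORES_AND_POSITIONS) {
      // No sensible default exists for positions, so fail loudly up front.
      throw new UnsupportedOperationException(
          "positions do not make sense for this query");
    }
    // DOCS and DOCS_AND_FREQS_AND_SCORES never fail.
    return new DefaultValueScorer();
  }
}
```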

Some examples:
 - TermQuery would fail if positions are asked for but not indexed
 - whatever a PhraseQuery is asked for, it would create its sub weights with 
DOCS_AND_FREQS_AND_SCORES_AND_POSITIONS, which would in turn fail if any of the 
sub queries does not support positions
 - if a constant-score query is asked for DOCS_AND_FREQS_AND_SCORES, it would 
pass DOCS to the underlying query (like today with needsScores). However if it 
is asked for DOCS_AND_FREQS_AND_SCORES_AND_POSITIONS then it would pass 
DOCS_AND_FREQS_AND_SCORES_AND_POSITIONS to the underlying query
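
The three examples above can be sketched as small pure functions. This assumes 
that the phrase case requests positions from its sub queries, since that is 
the only level that is allowed to fail. The class name, the method names and 
the exception type are hypothetical; the enum is redeclared so that the 
snippet compiles on its own.

```java
// Hypothetical sketch of how each query type could treat the requested level.
enum PostingsFeatures {
  DOCS,
  DOCS_AND_FREQS_AND_SCORES,
  DOCS_AND_FREQS_AND_SCORES_AND_POSITIONS
}

final class FeaturePropagation {
  private FeaturePropagation() {}

  /** Term query: fails only if positions are requested but were not indexed. */
  static void checkTermQuery(PostingsFeatures requested, boolean positionsIndexed) {
    if (requested == PostingsFeatures.DOCS_AND_FREQS_AND_SCORES_AND_POSITIONS
        && !positionsIndexed) {
      throw new IllegalStateException("positions requested but not indexed");
    }
  }

  /** Phrase query (assumption): always needs positions from its sub queries. */
  static PostingsFeatures forPhraseSubQuery(PostingsFeatures requested) {
    return PostingsFeatures.DOCS_AND_FREQS_AND_SCORES_AND_POSITIONS;
  }

  /** Constant-score wrapper: drops scoring but still forwards position requests. */
  static PostingsFeatures forConstantScoreSubQuery(PostingsFeatures requested) {
    if (requested == PostingsFeatures.DOCS_AND_FREQS_AND_SCORES_AND_POSITIONS) {
      return requested;
    }
    return PostingsFeatures.DOCS;
  }
}
```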

I did not suggest anything for offsets/payloads on purpose, to reduce the scope 
for now, but I don't think anything I suggest would get in the way of adding 
them in the future?

> Allow TermScorer to expose positions, offsets and payloads
> ----------------------------------------------------------
>
>                 Key: LUCENE-6226
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6226
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>             Fix For: Trunk, 5.1
>
>         Attachments: LUCENE-6226.patch, LUCENE-6226.patch, LUCENE-6226.patch, 
> LUCENE-6226.patch, LUCENE-6226.patch
>
>



