[ 
https://issues.apache.org/jira/browse/LUCENE-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880606#comment-16880606
 ] 

Adrien Grand commented on LUCENE-4312:
--------------------------------------

bq. the complexity of query execution would be driven by what's actually in the 
index

I don't think this is true.

For instance an exact phrase query trying to match "A B C" that is currently 
positioned on A (position=3, length=1), B (position=4, length=1), C 
(position=6, length=1) would need to advance B to the next position in case 
there is another match on position 4 that has a length of 2. And then we should 
advance C first, because maybe it also has another match on position 4 of a 
different length.
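
To make the example above concrete, here is a minimal sketch in Java. It is not Lucene code: the Occ class, the in-memory lists and the matches method are hypothetical stand-ins for postings that would carry a position length per position. It only illustrates that every (position, length) entry of a term has to be considered before a candidate can be rejected.

```java
import java.util.List;

public class PositionLengthPhraseSketch {

    /** Hypothetical posting entry: a start position plus a position length. */
    static final class Occ {
        final int position;
        final int length;

        Occ(int position, int length) {
            this.position = position;
            this.length = length;
        }

        int end() {
            return position + length;
        }
    }

    /**
     * Returns true if some chain of occurrences, one per term in phrase order,
     * satisfies next.position == previous.end().
     */
    static boolean matches(List<List<Occ>> terms) {
        for (Occ first : terms.get(0)) {
            if (matchesFrom(terms, 1, first.end())) {
                return true;
            }
        }
        return false;
    }

    private static boolean matchesFrom(List<List<Occ>> terms, int termIndex, int requiredStart) {
        if (termIndex == terms.size()) {
            return true;
        }
        // Every entry of this term must be tried: two entries can share a
        // start position and differ only in length.
        for (Occ occ : terms.get(termIndex)) {
            if (occ.position == requiredStart && matchesFrom(terms, termIndex + 1, occ.end())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Occ> a = List.of(new Occ(3, 1));
        List<Occ> bSingle = List.of(new Occ(4, 1));
        List<Occ> bGraph = List.of(new Occ(4, 1), new Occ(4, 2));
        List<Occ> c = List.of(new Occ(6, 1));

        System.out.println(matches(List.of(a, bSingle, c))); // false: B ends at 5, C starts at 6
        System.out.println(matches(List.of(a, bGraph, c)));  // true: B (length 2) ends at 6
    }
}
```

The exhaustive search is deliberate: it shows the extra work per candidate, not how a real scorer would be written.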

Also we can't advance positions on terms in the order we want anymore. Today we 
use the rarer term to lead the iteration of positions. If we had position 
lengths in the index we would need to advance positions in the order in which 
terms occur in the phrase query since the start position that B must have 
depends on the length of A on the current position: position starts are 
guaranteed to come in order in the index but position ends are not (at least we 
don't enforce it in token streams today).
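
A small sketch of the ordering point, again with made-up data rather than anything read from an index: the starts and lengths arrays below are assumptions chosen for illustration (a token at position 4 spanning 3 positions, followed by a token at position 5 spanning 1). Starts come in order, but the derived ends do not.

```java
public class PositionEndOrderSketch {

    /** True if the values never decrease from one element to the next. */
    static boolean isNonDecreasing(int[] values) {
        for (int i = 1; i < values.length; i++) {
            if (values[i] < values[i - 1]) {
                return false;
            }
        }
        return true;
    }

    /** Computes ends[i] = starts[i] + lengths[i]. */
    static int[] ends(int[] starts, int[] lengths) {
        int[] result = new int[starts.length];
        for (int i = 0; i < starts.length; i++) {
            result[i] = starts[i] + lengths[i];
        }
        return result;
    }

    public static void main(String[] args) {
        int[] starts = {4, 5};   // hypothetical start positions, in index order
        int[] lengths = {3, 1};  // hypothetical position lengths
        int[] ends = ends(starts, lengths); // {7, 6}

        System.out.println(isNonDecreasing(starts)); // true
        System.out.println(isNonDecreasing(ends));   // false
    }
}
```

Because ends can go backwards, a term that follows another in the phrase cannot be advanced to a required start until the length of the preceding term at its current position is known.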

> Index format to store position length per position
> --------------------------------------------------
>
>                 Key: LUCENE-4312
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4312
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs
>    Affects Versions: 6.0
>            Reporter: Gang Luo
>            Priority: Minor
>              Labels: Suggestion
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Mike McCandless said: TokenStreams are actually graphs.
> Indexer ignores PositionLengthAttribute. Need to change the index format (and 
> Codec APIs) to store an additional int position length per position.


