[ 
https://issues.apache.org/jira/browse/LUCENE-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186364#comment-13186364
 ] 

Michael McCandless commented on LUCENE-3684:
--------------------------------------------

bq. I think we just need two assertEquals() in TestDuelingCodecs for the future.

Good -- I added that; it found a bug in SimpleText (currently the only codec
that can index offsets...), and I fixed that.

{quote}
For checkindex, long term I think we should really consider adding a (slow, not
by default) option to verify the term vectors against the postings. We could at
least turn it on in tests..., but that's another separate issue.
{quote}

I added that, in one direction: for each term vector, it seeks the
Terms/Docs/AndPositionsEnum and verifies that everything is the same. It
uncovered a sneaky bug in the Lucene3x codec (not present in 3.x) where we were
failing to make a deep copy of the Term before using it as a key in the terms
cache. I fixed it.
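
As a rough illustration of that class of bug (a sketch only, not the Lucene3x
code: MutableTerm and TermCacheSketch are hypothetical names), a mutable object
that the caller later reuses makes a poor hash-map key unless it is copied
first:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical mutable key, standing in for a term object that callers reuse.
final class MutableTerm {
    String field;
    String text;

    MutableTerm(String field, String text) {
        this.field = field;
        this.text = text;
    }

    // Independent copy: later changes to the original do not affect it.
    MutableTerm deepCopy() {
        return new MutableTerm(field, text);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof MutableTerm)) {
            return false;
        }
        MutableTerm other = (MutableTerm) o;
        return field.equals(other.field) && text.equals(other.text);
    }

    @Override
    public int hashCode() {
        return 31 * field.hashCode() + text.hashCode();
    }
}

final class TermCacheSketch {
    // Caches a value under ("body", "apple"), lets the caller reuse the key
    // object for another term, then looks up ("body", "apple") again.
    static Integer lookupAfterReuse(boolean copyKey) {
        Map<MutableTerm, Integer> cache = new HashMap<>();
        MutableTerm scratch = new MutableTerm("body", "apple");
        cache.put(copyKey ? scratch.deepCopy() : scratch, 1);
        scratch.text = "banana"; // caller moves on to the next term
        return cache.get(new MutableTerm("body", "apple"));
    }

    public static void main(String[] args) {
        System.out.println("without copy: " + lookupAfterReuse(false));
        System.out.println("with copy:    " + lookupAfterReuse(true));
    }
}
```

Without the copy, the cached entry's key changes underneath the map and the
lookup misses; with the copy, the entry stays reachable.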

bq. But I think this is wrong, we must use compareTo >= 0?

Right -- I fixed several places that were still doing == or !=.  I left the
ones in non-SimpleText codecs alone -- they are still OK since those codecs
refuse to index offsets.
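
To show why == breaks once a new, higher level is added, here is a minimal
sketch. It assumes the comparison is on an ordered enum of indexing levels;
IndexDetail and IndexDetailChecks are hypothetical stand-ins, not the actual
Lucene classes:

```java
// Hypothetical ordered enum: each constant indexes everything the previous
// one does, plus one more thing. Declaration order defines compareTo().
enum IndexDetail {
    DOCS_ONLY,
    DOCS_AND_FREQS,
    DOCS_AND_FREQS_AND_POSITIONS,
    DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS // the newly added level
}

final class IndexDetailChecks {
    // Fragile: false for any level declared after POSITIONS, even though
    // such a level still includes positions.
    static boolean hasPositionsByEquality(IndexDetail d) {
        return d == IndexDetail.DOCS_AND_FREQS_AND_POSITIONS;
    }

    // Robust: true for POSITIONS and for every level declared after it.
    static boolean hasPositionsByOrder(IndexDetail d) {
        return d.compareTo(IndexDetail.DOCS_AND_FREQS_AND_POSITIONS) >= 0;
    }

    public static void main(String[] args) {
        IndexDetail d = IndexDetail.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS;
        System.out.println("== check:        " + hasPositionsByEquality(d));
        System.out.println("compareTo check: " + hasPositionsByOrder(d));
    }
}
```

The equality check only stays correct for code that never sees the new level,
which matches the reasoning for leaving the codecs that refuse to index
offsets untouched.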

bq. I think this code should start with a min3xFormat-1.

Ahh right!  I removed that and just kept FORMAT_FLEX.

Thanks for the reviews Robert and Simon!
                
> Add offsets to postings (D&PEnum)
> ---------------------------------
>
>                 Key: LUCENE-3684
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3684
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.0
>
>         Attachments: LUCENE-3684.patch, LUCENE-3684.patch
>
>
> I think we should explore making start/end offsets a first-class attr in the
> postings APIs, and fixing the indexer to index them into postings.
> This will make term vector access cleaner (we now have to jump through
> hoops w/ non-first-class offset attr).  It can also enable efficient
> highlighting without term vectors / reanalyzing, if the app indexes
> offsets into the postings.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
