[
https://issues.apache.org/jira/browse/LUCENE-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186364#comment-13186364
]
Michael McCandless commented on LUCENE-3684:
--------------------------------------------
bq. I think we just need two assertEquals() in TestDuelingCodecs for the future.
Good -- I added that, it found a bug in SimpleText (only codec that can index
offsets currently...) and I fixed that.
{quote}
For CheckIndex, long term I think we should really consider adding a (slow,
not by default) option to verify the term vectors against the postings. We
could at least turn it on in tests..., but that's another separate issue.
{quote}
I added that, in one direction (for each TV it seeks the
Terms/Docs/AndPositionsEnum to verify everything is the same)... and it
uncovered a sneaky bug in Lucene3x codec (not present in 3.x) where we were
failing to make a deep copy of the Term before using it as a key in the terms
cache... I fixed it.
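
To illustrate the class of bug described above, here is a minimal sketch of why a mutable key needs a deep copy before it goes into a hash-based cache. It is not the Lucene3x codec code: MutableTerm, deepCopy() and cacheHitAfterReuse() are hypothetical names invented for this example, and the "cache" is a plain java.util.HashMap.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class TermCacheKeyDemo {
    /** Hypothetical mutable term; stands in for a term object that callers reuse. */
    static final class MutableTerm {
        String field;
        String text;

        MutableTerm(String field, String text) {
            this.field = field;
            this.text = text;
        }

        /** Returns an independent copy that later mutation of this term cannot affect. */
        MutableTerm deepCopy() {
            return new MutableTerm(field, text);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof MutableTerm)) return false;
            MutableTerm t = (MutableTerm) o;
            return field.equals(t.field) && text.equals(t.text);
        }

        @Override
        public int hashCode() {
            return Objects.hash(field, text);
        }
    }

    /**
     * Caches a value under a term, mutates the caller's term (as a reusing
     * enumerator might), then reports whether the original entry is still found.
     */
    static boolean cacheHitAfterReuse(boolean copyKey) {
        Map<MutableTerm, String> cache = new HashMap<>();
        MutableTerm scratch = new MutableTerm("body", "apple");
        cache.put(copyKey ? scratch.deepCopy() : scratch, "info-for-apple");
        scratch.text = "banana"; // caller reuses its scratch term
        return cache.containsKey(new MutableTerm("body", "apple"));
    }

    public static void main(String[] args) {
        System.out.println("shared key hit: " + cacheHitAfterReuse(false));
        System.out.println("copied key hit: " + cacheHitAfterReuse(true));
    }
}
```

When the key is shared, mutating it after insertion leaves an entry in the map that no longer equals the term it was stored under, so the lookup misses. Copying the key at insertion time avoids that.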
bq. But I think this is wrong, we must use compareTo >= 0?
Right -- I fixed several places that were still doing == or !=. I left ones in
non-SimpleText codecs -- they are still OK since they refuse to index offsets.
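
A small sketch of why == breaks once a new "offsets" level exists, assuming the checks in question compare an ordered enum of index options. The IndexOptions enum below is a local, hypothetical stand-in declared just for this example (ordered from least to most detail), and the two helper methods are made up for illustration.

```java
class IndexOptionsCheckDemo {
    /** Hypothetical enum for this sketch; constants are ordered from least to most detail. */
    enum IndexOptions {
        DOCS_ONLY,
        DOCS_AND_FREQS,
        DOCS_AND_FREQS_AND_POSITIONS,
        DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
    }

    /** Equality check: silently wrong for any level above "positions". */
    static boolean hasPositionsByEquality(IndexOptions opts) {
        return opts == IndexOptions.DOCS_AND_FREQS_AND_POSITIONS;
    }

    /** Ordering check: true for "positions" and every richer level. */
    static boolean hasPositionsByOrdering(IndexOptions opts) {
        return opts.compareTo(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS) >= 0;
    }

    public static void main(String[] args) {
        IndexOptions withOffsets = IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS;
        System.out.println("== check:        " + hasPositionsByEquality(withOffsets));
        System.out.println("compareTo check: " + hasPositionsByOrdering(withOffsets));
    }
}
```

A field indexed with offsets also has positions, but the equality check reports that it does not. A codec that refuses to index offsets never sees the highest level, which is why the == checks left in those codecs still behave correctly.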
bq. I think this code should start with a min3xFormat-1.
Ahh right! I removed that and just kept FORMAT_FLEX.
Thanks for the reviews Robert and Simon!
> Add offsets to postings (D&PEnum)
> ---------------------------------
>
> Key: LUCENE-3684
> URL: https://issues.apache.org/jira/browse/LUCENE-3684
> Project: Lucene - Java
> Issue Type: Improvement
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Fix For: 4.0
>
> Attachments: LUCENE-3684.patch, LUCENE-3684.patch
>
>
> I think we should explore making start/end offsets a first-class attr in the
> postings APIs, and fixing the indexer to index them into postings.
> This will make term vector access cleaner (we now have to jump through
> hoops w/ non-first-class offset attr). It can also enable efficient
> highlighting without term vectors / reanalyzing, if the app indexes
> offsets into the postings.