[ https://issues.apache.org/jira/browse/LUCENE-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir updated LUCENE-4299:
--------------------------------
Attachment: LUCENE-4299.patch
Updated patch fixing a pretty big inefficiency in the highlighter: its
hasPositions(termvectors) check previously had to clone an IndexInput and
read term bytes, freqs, positions, and offsets, just to see whether the
first position was -1.
> No way to find term vectors options at read time
> ------------------------------------------------
>
> Key: LUCENE-4299
> URL: https://issues.apache.org/jira/browse/LUCENE-4299
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
> Attachments: LUCENE-4299.patch, LUCENE-4299.patch, LUCENE-4299.patch,
> LUCENE-4299.patch
>
>
> The problem is simple:
> # Term vectors can be configured "per-field-per-document", meaning that for
> the "body" field, document 0 can have them, document 1 may not have them at
> all, document 2 may have offsets (no positions), and so on. To me this is not
> a useful feature at all; no one has ever mentioned a single use case for it,
> and it just makes our code more complicated. But it is what it is (for this
> issue).
> # There is no way to discover these options for a field of a document. You
> have to do things like 'peek ahead' to see whether the first position of the
> first term is -1, or the same for offsets (except worse: we used to allow
> anything in offsets, so -1 might be an actual value). This makes the merging
> code really hairy, and is tough on end consumers.
> So I propose that instead of returning Terms for vectors, we return
> VectorTerms (extends Terms), which just adds hasOffsets() and hasPositions().
> E.g. lucene40 already knows this from the bits for the field/doc pair and
> can just return what it knows.
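
A rough sketch of the shape the proposal describes, contrasted with the
peek-ahead workaround. It does not use the real Lucene classes: the Terms
stand-in, the flag constants STORE_POSITIONS and STORE_OFFSETS, and the
helpers fromFlags and hasPositionsByPeeking are all hypothetical names made
up for illustration, and the patch itself may look quite different.

```java
import java.util.Arrays;
import java.util.List;

class VectorTermsSketch {

    /** Minimal stand-in for Lucene's Terms; not the real class. */
    static abstract class Terms {
        abstract List<String> terms();
    }

    /** Hypothetical VectorTerms from the proposal: Terms plus two flags. */
    static abstract class VectorTerms extends Terms {
        abstract boolean hasPositions();
        abstract boolean hasOffsets();
    }

    // Assumed per field/doc flag bits; names and values are illustrative only.
    static final int STORE_POSITIONS = 0x1;
    static final int STORE_OFFSETS = 0x2;

    /** Builds a VectorTerms that answers from the stored flags, reading no postings. */
    static VectorTerms fromFlags(final int flags, final String... terms) {
        final List<String> termList = Arrays.asList(terms);
        return new VectorTerms() {
            @Override List<String> terms() { return termList; }
            @Override boolean hasPositions() { return (flags & STORE_POSITIONS) != 0; }
            @Override boolean hasOffsets() { return (flags & STORE_OFFSETS) != 0; }
        };
    }

    /** The old workaround: decode the first term's positions and test for -1. */
    static boolean hasPositionsByPeeking(int[] firstTermPositions) {
        return firstTermPositions.length > 0 && firstTermPositions[0] != -1;
    }

    public static void main(String[] args) {
        // Document 2 from the description: offsets but no positions.
        VectorTerms body = fromFlags(STORE_OFFSETS, "lucene", "vectors");
        System.out.println("hasPositions=" + body.hasPositions());
        System.out.println("hasOffsets=" + body.hasOffsets());
        System.out.println("peek=" + hasPositionsByPeeking(new int[] {-1}));
    }
}
```

With flags available, a consumer such as the highlighter or the merging code
can ask the question directly instead of decoding the first term's postings,
and the answer no longer depends on -1 being an unambiguous sentinel.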