Robert Muir created LUCENE-4299:
-----------------------------------
Summary: No way to find term vectors options at read time
Key: LUCENE-4299
URL: https://issues.apache.org/jira/browse/LUCENE-4299
Project: Lucene - Core
Issue Type: Bug
Reporter: Robert Muir
The problem is simple:
# Term vectors can be configured "per-field-per-document": for the "body"
field, document 0 can have them, document 1 might not have them at all,
document 2 might have offsets but no positions, and so on. To me this is not a
useful feature at all; no one has ever mentioned a single use case for it,
and it just makes our code more complicated. But it is what it is (for this
issue).
# There is no way to discover these options for a field of a document. You have
to do things like "peek ahead" to see whether the first position of the first
term is -1, or do the same for offsets (which is worse: we used to allow
anything in offsets, so -1 might be an actual value). This makes the merging
code really hairy, and it is tough on end consumers.
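To illustrate why the "peek ahead" check is fragile, here is a minimal sketch.
It does not use the real Lucene classes: FirstPosting and PeekAhead are
hypothetical stand-ins that only model "the first position and first start
offset of the first term", with -1 as the "not stored" sentinel described above.

```java
/** Hypothetical stand-in for the first posting of the first term in a vector; not a Lucene class. */
final class FirstPosting {
    final int position;    // -1 when positions were not stored
    final int startOffset; // -1 when offsets were not stored, but possibly also a real legacy value

    FirstPosting(int position, int startOffset) {
        this.position = position;
        this.startOffset = startOffset;
    }
}

/** The heuristic a consumer is forced into today: infer the options from sentinel values. */
final class PeekAhead {
    static boolean guessHasPositions(FirstPosting first) {
        return first.position != -1;
    }

    static boolean guessHasOffsets(FirstPosting first) {
        // Ambiguous: a stored offset of -1 looks exactly like "offsets not stored".
        return first.startOffset != -1;
    }
}
```

In this sketch, a vector whose first stored start offset happens to be -1 is
reported as having no offsets, which is exactly the ambiguity a merger or end
consumer cannot resolve from the data alone.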
So I propose that instead of returning Terms for vectors, we return VectorTerms
(extends Terms), which just adds hasOffsets() and hasPositions(). For example,
lucene40 already knows this from the bits for the field/doc pair and can just
return what it knows.
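A rough sketch of what this could look like follows. Only the names
VectorTerms, hasOffsets() and hasPositions() come from the proposal; the Terms
class here is a cut-down stand-in for the real one so the example compiles on
its own, and FlaggedVectorTerms with its bit constants is a hypothetical
implementation, not the actual lucene40 file format.

```java
/** Cut-down stand-in for org.apache.lucene.index.Terms, reduced to one method for this sketch. */
abstract class Terms {
    abstract long size();
}

/** Proposed type: Terms plus the options stored for this field of this document. */
abstract class VectorTerms extends Terms {
    abstract boolean hasPositions();
    abstract boolean hasOffsets();
}

/** Hypothetical implementation that answers from per-field/doc flag bits read by the codec. */
final class FlaggedVectorTerms extends VectorTerms {
    // Assumed bit layout, for illustration only.
    static final int STORE_POSITIONS = 0x1;
    static final int STORE_OFFSETS = 0x2;

    private final int flags;
    private final long termCount;

    FlaggedVectorTerms(int flags, long termCount) {
        this.flags = flags;
        this.termCount = termCount;
    }

    @Override
    long size() {
        return termCount;
    }

    @Override
    boolean hasPositions() {
        return (flags & STORE_POSITIONS) != 0;
    }

    @Override
    boolean hasOffsets() {
        return (flags & STORE_OFFSETS) != 0;
    }
}
```

With something like this, merging code could branch on hasPositions() and
hasOffsets() up front instead of inspecting the first term for -1 values.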