Thank you for all these useful hints!
If I use multi-valued fields in combination with "modified" position
increments, I would actually distort the term positions within a document.
For instance, suppose I want to compare two retrieval settings: one that
enforces query term co-occurrence within the same sentence, and one that
uses PhraseQuery (or SpanNearQuery) and only requires the query terms to
appear within a text window of n words (where this window may also cross
sentence boundaries). For the second setting I would need to create
another index in which I do not modify the position increments.
Is that right?
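
To make the concern concrete, here is a small plain-Java sketch. It does not call Lucene; it only models, in simplified form, how positions would be numbered if every sentence after the first were shifted by an extra gap. The class name GapDistortionDemo, the sample tokens, and the gap value of 100 are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model of token positions; it does not use Lucene itself. */
final class GapDistortionDemo {

    /**
     * Assigns one position per token. Every sentence after the first is
     * shifted by an extra `gap` positions (gap = 0 leaves positions unmodified).
     */
    static List<Integer> positions(String[][] sentences, int gap) {
        List<Integer> result = new ArrayList<>();
        int pos = -1;
        for (int s = 0; s < sentences.length; s++) {
            if (s > 0) {
                pos += gap; // hypothetical gap inserted at each sentence boundary
            }
            for (int t = 0; t < sentences[s].length; t++) {
                pos++;
                result.add(pos);
            }
        }
        return result;
    }

    /** Position distance between the i-th and j-th token of the flattened text. */
    static int distance(List<Integer> positions, int i, int j) {
        return Math.abs(positions.get(i) - positions.get(j));
    }

    public static void main(String[] args) {
        String[][] doc = {
            {"lucene", "indexes", "text"},
            {"queries", "match", "terms"}
        };
        // "text" is token 2 and "queries" is token 3: neighbours in the original text.
        System.out.println(distance(positions(doc, 0), 2, 3));   // prints 1
        System.out.println(distance(positions(doc, 100), 2, 3)); // prints 101
    }
}
```

In this model the two neighbouring words end up 101 positions apart once the gap is applied, so a window of n words that is allowed to cross sentence boundaries can only be evaluated against the unmodified positions.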
Best,
Michael
Ian Lea wrote:
You can use multi-valued fields if you play with the position
increment gap. See e.g.
http://lucene.472066.n3.nabble.com/Problem-searching-in-the-same-sentence-td1501269.html
A Google search for "lucene indexing sentences" or similar finds that, and more.
Different docs can have different fields/different numbers of fields,
but the position gap approach is probably better.
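
The idea behind the position gap approach can be sketched in plain Java without Lucene. The sketch below assumes each sentence is added as one value of a multi-valued field and that a gap (100 here, an arbitrary assumption) is added to the position counter between values. The class SentenceGapSketch and its methods are hypothetical, and the distance check is a simplification rather than Lucene's exact slop semantics.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified model of a positional index with a gap between sentences. */
final class SentenceGapSketch {

    /** Maps each term to its positions; sentences after the first are shifted by `gap`. */
    static Map<String, List<Integer>> index(List<String> sentences, int gap) {
        Map<String, List<Integer>> postings = new HashMap<>();
        int pos = -1;
        boolean first = true;
        for (String sentence : sentences) {
            if (!first) {
                pos += gap; // extra distance inserted between two sentences
            }
            first = false;
            for (String token : sentence.toLowerCase().split("\\s+")) {
                if (token.isEmpty()) {
                    continue;
                }
                pos++;
                postings.computeIfAbsent(token, k -> new ArrayList<>()).add(pos);
            }
        }
        return postings;
    }

    /** True if some occurrence of a and some occurrence of b are at most maxDistance apart. */
    static boolean near(Map<String, List<Integer>> postings, String a, String b, int maxDistance) {
        List<Integer> none = Collections.emptyList();
        for (int pa : postings.getOrDefault(a, none)) {
            for (int pb : postings.getOrDefault(b, none)) {
                if (Math.abs(pa - pb) <= maxDistance) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList(
            "Lucene indexes text quickly",
            "Queries match indexed terms");
        Map<String, List<Integer>> gapped = index(doc, 100);
        System.out.println(near(gapped, "lucene", "quickly", 10)); // prints true (same sentence)
        System.out.println(near(gapped, "text", "queries", 10));   // prints false (different sentences)
    }
}
```

As long as the gap is larger than both the longest sentence and the maximum distance used in the proximity check, a match in this model can only come from two terms in the same sentence.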
--
Ian.
On Fri, Mar 4, 2011 at 7:06 AM, Michael Wiegand
<michael.wieg...@lsv.uni-saarland.de> wrote:
Hi,
I would like to create an index with Lucene for a document collection of
text files.
The index should be created in such a way that, at search time, I can enforce
that query term A and query term B are contained within the same sentence.
How should I implement the index? Should I have a different field for every
sentence (but make sure that it is not a multi-valued field, because the values
would get merged, which is exactly what I do not want)?
Would it be problematic that different documents would then end up having
different numbers of fields?
Thank you in advance!
Best,
Michael
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org