Mark Miller wrote:
That is why the original contrib does not work with PhraseQuery's. It
simply matches Tokens from the query with those in the TokenStream.
LUCENE-794 takes the TokenStream and shoves it into a MemoryIndex.
Then, after converting the query to a SpanQuery approximation,
getSpans is called on the index for the query. The Spans provide a
bound on what positions should be Highlighted. Everything else is done
exactly like the original Highlighter (This is a patch that fits into
the original Highlighter framework that was developed, thereby
retaining all of its richness :) ).
Mark, thanks for your patience! I have one final (conceptual,
high-level) question concerning the usage of the MemoryIndex index over
the TokenStream. Is it a good idea to
store the procomputed MemoryIndex (conceptually speaking) as a field
into each document at indexing time and then just load this precomputed
index from
disk (as you do with TermVector) such that you save extra computation
for the highlighting?
Marjan.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]