You don't necessarily need to store the data in Lucene, but yes it does
need to be stored somewhere. Otherwise, where would the context come
from? If you are not stripping stopwords, stemming, lowercasing, or
otherwise changing the text at index time, I suppose you could rebuild
it from the index...
To avoid having to retokenize, check out the TokenSources class, which
lets you use TermVectors to rebuild the TokenStream. You still need the
original text to fragment and highlight, though. Whether you pull that
text from a database, the filesystem, or Lucene does not matter to the
highlighter.
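To make the last point concrete, here is a minimal plain-Java sketch. It does not use the contrib Highlighter API; it only illustrates that the highlighting step sees a String and does not care where it came from. The TextSource interface, the map standing in for a database, and the "msg-1" id are all hypothetical names made up for the illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExternalTextHighlight {

    /** Hypothetical abstraction: anything that can return the original text for a document id. */
    interface TextSource {
        String load(String docId);
    }

    /** Wraps every case-insensitive occurrence of term in <B>...</B> tags. */
    static String highlight(String text, String term) {
        Pattern p = Pattern.compile(Pattern.quote(term), Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Keep the original casing of the matched text.
            m.appendReplacement(out, Matcher.quoteReplacement("<B>" + m.group() + "</B>"));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // An in-memory map stands in for a database or the filesystem (assumption).
        Map<String, String> fakeDatabase = new HashMap<>();
        fakeDatabase.put("msg-1", "The crawler is a classic example of Thread usage.");

        TextSource source = fakeDatabase::get;
        System.out.println(highlight(source.load("msg-1"), "thread"));
    }
}
```

Swapping the map for a JDBC lookup or a file read would only change the TextSource implementation, which is roughly the separation the real highlighter gives you as well.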
- Mark
DURGA DEEP wrote:
I have a follow-up question. It seems that if we want to use highlighting,
we have to store the entire content of every document that is indexed:
d.add( new Field( FIELD_NAME, "some text", Field.Store.YES,
Field.Index.TOKENIZED) );
Are there better ways of achieving this? We have a huge amount of data
that needs to be indexed.
Thanks Much
_ddt
On 1/29/08, Mark Miller <[EMAIL PROTECTED]> wrote:
Look at the Highlighter in contrib. It creates fragments (context) and
highlights search terms in them (keywords).
If you want to highlight phrases correctly, check out this issue, which
adds support for Spans and PhraseQuery:
https://issues.apache.org/jira/browse/LUCENE-794
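
As a rough illustration of what "fragment plus highlighted keyword" means, here is a small plain-Java sketch. It is not the contrib Highlighter and does no scoring or tokenizing; it just cuts a character window around the first match and wraps the match in tags. The class name, method name, and window parameter are assumptions made for the example.

```java
import java.util.Locale;

class FragmentSketch {

    /**
     * Returns the first occurrence of term with up to `window` characters of
     * context on each side and the match wrapped in <B> tags, or null if the
     * term does not occur. Assumes lowercasing does not change the string
     * length, which holds for plain ASCII text.
     */
    static String bestFragment(String text, String term, int window) {
        int hit = text.toLowerCase(Locale.ROOT).indexOf(term.toLowerCase(Locale.ROOT));
        if (hit < 0) {
            return null;
        }
        int end = hit + term.length();
        int from = Math.max(0, hit - window);
        int to = Math.min(text.length(), end + window);

        StringBuilder sb = new StringBuilder();
        if (from > 0) {
            sb.append("...");          // text was cut on the left
        }
        sb.append(text, from, hit)
          .append("<B>").append(text, hit, end).append("</B>")
          .append(text, end, to);
        if (to < text.length()) {
            sb.append("...");          // text was cut on the right
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String body = "Our web crawler is a classic example of Thread in a pool executor code base.";
        System.out.println(bestFragment(body, "thread", 20));
    }
}
```

The real Highlighter works on tokens rather than raw characters and picks the best-scoring fragments, so treat this only as a picture of the output shape.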
Mark
DURGA DEEP wrote:
Dear All,
I've been scouring through the Lucene classes. Are there any classes
which can help me achieve the following?
1) We are an e-mail service provider. We want to provide a search
capability for e-mail messages via Lucene. So far we are able to
index/parse the e-mail, create the appropriate indexes, etc.
Now the customer wants us to have a Google-like search capability, i.e.
when they search for a particular word, the word should be highlighted,
and the surrounding text, i.e. the context in which the word occurs,
should also be shown.
Example: when searching for the word thread.
...crawler is a classic example of Thread in an poolExecutor code
an poolExecutor code...
Any help greatly appreciated
+ddt
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]