Re: Confused again ... Getting at results

Erik Hatcher Fri, 09 Dec 2005 16:18:57 -0800


On Dec 9, 2005, at 3:18 PM, Alan Chandler wrote:

I am slowly making my way through Lucene, as witnessed by earlier threads to
this mailing list.
But I am stuck again, going round in circles with the Javadocs.
I want to display the results of a user-entered search where, for each document,
I put out a small summary with the searched-for words highlighted.
When I wrote the Analyzer for my documents, I produced the token stream to generate Token objects with the start and end positions of each term in them.
Now, from my Hits object I can find each document I need to output, but how do
I get back to the Tokens I originally produced?


Are you using Lucene 1.4.3?  Or the latest Subversion version?

The Lucene index does not keep all of the information in the Tokens emitted by the analyzer (unless specified to do so, but 1.4.3 didn't support the fancier features).

So, the fail-safe way is to re-tokenize the original text (perhaps stored in the Lucene index) and hand that TokenStream to the Highlighter.

Or you can experiment with the additional Field constructors to enable the storage of token offsets, which the Highlighter can use for slightly better performance. For your application, though, the cost of simply re-tokenizing on the fly, for only the fields you're displaying, is likely to be unnoticeable. Storing the token offsets increases the index size, of course.
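
To make the re-tokenizing idea concrete, here is a minimal sketch in plain Java. It deliberately does not use the Lucene API: the Tok class, the tokenize() method and the highlight() method are hypothetical stand-ins for a Token, an Analyzer and the Highlighter respectively, and the toy tokenizer (split on non-alphanumerics, lowercase) is an assumption, not what your Analyzer does. The point is only that running the stored text through the same tokenizer again recovers the start and end offsets, which is all that is needed to mark up the matching terms.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class RetokenizeHighlight {

    /** Hypothetical stand-in for a Token: term text plus offsets into the original text. */
    static final class Tok {
        final String term;
        final int start; // inclusive
        final int end;   // exclusive

        Tok(String term, int start, int end) {
            this.term = term;
            this.start = start;
            this.end = end;
        }
    }

    /** Toy analyzer (an assumption): split on non-alphanumerics and lowercase each term. */
    static List<Tok> tokenize(String text) {
        List<Tok> out = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            if (!Character.isLetterOrDigit(text.charAt(i))) {
                i++;
                continue;
            }
            int start = i;
            while (i < text.length() && Character.isLetterOrDigit(text.charAt(i))) {
                i++;
            }
            out.add(new Tok(text.substring(start, i).toLowerCase(Locale.ROOT), start, i));
        }
        return out;
    }

    /** Re-tokenizes the stored text and wraps every token whose term is in queryTerms. */
    static String highlight(String text, Set<String> queryTerms) {
        StringBuilder sb = new StringBuilder();
        int last = 0; // end offset of the last piece already copied
        for (Tok t : tokenize(text)) {
            if (queryTerms.contains(t.term)) {
                sb.append(text, last, t.start);          // untouched text before the match
                sb.append("<B>").append(text, t.start, t.end).append("</B>");
                last = t.end;
            }
        }
        sb.append(text.substring(last));                 // remainder after the last match
        return sb.toString();
    }

    public static void main(String[] args) {
        Set<String> terms = new HashSet<>();
        terms.add("lucene");
        terms.add("tokens");
        // prints: <B>Lucene</B> does not store all <B>Tokens</B>.
        System.out.println(highlight("Lucene does not store all Tokens.", terms));
    }
}
```

Note that the original casing and punctuation survive because the markup is applied to the original text by offset, not rebuilt from the normalized terms. That is the reason the offsets matter at all.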


        Erik


