On Fri, 2001-10-19 at 17:01, Doug Cutting wrote:
> > Rather than highlight terms, I would just extract the first hit token,
> > and a certain number of characters either side of it.
>
> I think this is the best approach.  Since you'll probably only be displaying
> around ten hits at a time, the cost of re-tokenizing is fairly small.
> Please consider contributing your code when it is complete.

I'm trying to implement this and should be able to contribute any
successful results, but I need to produce context on a per-field basis.
E.g. if I got a token hit in the text body of a document, but the first
hit token was a word in the section title, I'd want to generate context
around the token in the text body.
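
To illustrate the per-field requirement, here is a minimal sketch in
plain Java. It does not use any Lucene classes: FieldContext and
contextFor() are hypothetical names, the document is assumed to be
available as a map of field name to raw text, and a plain
case-insensitive substring search stands in for real tokenizing.

```java
import java.util.Map;

// Hypothetical helper, not part of Lucene. It assumes the raw text of
// each field is still available, keyed by field name.
final class FieldContext {

    // Returns the index of the first case-insensitive occurrence of term
    // in text, or -1 if there is none.
    static int indexOfIgnoreCase(String text, String term) {
        for (int i = 0; i + term.length() <= text.length(); i++) {
            if (text.regionMatches(true, i, term, 0, term.length())) {
                return i;
            }
        }
        return -1;
    }

    // Returns the first hit for term in the named field, plus up to
    // radius characters either side of it. Returns null if the field is
    // missing or the term does not occur in that field.
    static String contextFor(Map<String, String> fields, String fieldName,
                             String term, int radius) {
        String text = fields.get(fieldName);
        if (text == null || term.isEmpty()) {
            return null;
        }
        int hit = indexOfIgnoreCase(text, term);
        if (hit < 0) {
            return null;
        }
        int start = Math.max(0, hit - radius);
        int end = Math.min(text.length(), hit + term.length() + radius);
        return text.substring(start, end);
    }
}
```

Because the field name is passed in explicitly, a hit in the section
title never affects the context generated for the text body.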

I had been using a TokenStream to try this. However, Lucene's Token
class doesn't seem to have any concept of fields (even when I call
tokenStream() on a document that is in the index with a whole bunch of
fields). Is there any reason for this? Moreover, any suggestions on how
to find the information I need?

The natural thing seems to be to have a field-aware token stream, but
I'm not sure how I'd go about implementing that...
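
As a rough sketch of what "field-aware" could mean here, the plain-Java
example below tags every token with the field it came from and its
character offsets within that field. FieldToken and FieldAwareTokenizer
are hypothetical names, not Lucene classes, and splitting on whitespace
is an assumption standing in for a real analyzer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical token type that remembers which field it came from.
final class FieldToken {
    final String field;  // name of the field the token was found in
    final String text;   // the token text
    final int start;     // offset of the first character within the field
    final int end;       // offset one past the last character

    FieldToken(String field, String text, int start, int end) {
        this.field = field;
        this.text = text;
        this.start = start;
        this.end = end;
    }
}

// Hypothetical tokenizer: walks each field in turn and splits its text
// on whitespace, recording the field name alongside every token.
final class FieldAwareTokenizer {

    static List<FieldToken> tokenize(Map<String, String> fields) {
        List<FieldToken> tokens = new ArrayList<>();
        for (Map.Entry<String, String> entry : fields.entrySet()) {
            String text = entry.getValue();
            int i = 0;
            while (i < text.length()) {
                // Skip whitespace before the next token.
                while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
                    i++;
                }
                int start = i;
                // Consume characters up to the next whitespace.
                while (i < text.length() && !Character.isWhitespace(text.charAt(i))) {
                    i++;
                }
                if (i > start) {
                    tokens.add(new FieldToken(entry.getKey(),
                            text.substring(start, i), start, i));
                }
            }
        }
        return tokens;
    }
}
```

With tokens like these, the first hit in a given field can be found by
filtering on the field name, and the offsets give the point around which
to cut the context.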

Regards,

-- 
Lee Mallabone
