Let's say that for the query "algorithm", the word "algorith" is also a match. How does the highlighter know that it should also highlight
occurrences of the word "algorith"? (I am not sure it does this anyway.)

The highlighter knows to highlight stemmed words because both the query terms and the document content are fed through (hopefully) the same analyzer, so "algorithmic", "algorithm", "algorithms" etc. are all stemmed to the same root form in both the query and the document content. The tokens produced by analyzers carry the start and end character offsets of the *original* full word in the source text, not just the stemmed form, so the highlighter knows the full extent of what to highlight in the text.
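
To make the idea concrete, here is a rough sketch in Python rather than real Lucene code. Everything in it is made up for illustration: `toy_stem`, `analyze`, `highlight` and the suffix list are hypothetical stand-ins, and the stemmer is far cruder than a real one. It only shows the principle: matching happens on the stemmed term, while highlighting uses the offsets of the original word.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    term: str   # stemmed form, used only for matching
    start: int  # character offset where the original word starts
    end: int    # character offset just past the original word


# Hypothetical toy suffix list; a real stemmer is far more careful.
SUFFIXES = ("ically", "ics", "ic", "s")


def toy_stem(word: str) -> str:
    """Lowercase the word and strip the first matching suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def analyze(text: str) -> List[Token]:
    """Toy analyzer: split on word characters, stem, keep original offsets."""
    return [
        Token(toy_stem(m.group()), m.start(), m.end())
        for m in re.finditer(r"\w+", text)
    ]


def highlight(text: str, query: str, pre: str = "<b>", post: str = "</b>") -> str:
    """Wrap every word whose stem matches a stemmed query term."""
    # The query goes through the same analyzer as the document text.
    query_terms = {tok.term for tok in analyze(query)}
    out = []
    pos = 0
    for tok in analyze(text):
        if tok.term in query_terms:
            out.append(text[pos:tok.start])
            # Offsets point at the original word, so the full word is wrapped.
            out.append(pre + text[tok.start:tok.end] + post)
            pos = tok.end
    out.append(text[pos:])
    return "".join(out)


print(highlight("Algorithmic tricks make algorithms fast.", "algorithm"))
# prints: <b>Algorithmic</b> tricks make <b>algorithms</b> fast.
```

Note that if the query and the text went through different analyzers, the stemmed terms would no longer line up and nothing would be highlighted, which is why using the same analyzer on both sides matters.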


Cheers
Mark

