On Apr 29, 2006, at 1:59 AM, Marios Skounakis wrote:
Suppose the user enters the following query using a textbox
interface: "rate based optimization" (as a phrase query, including
the quotes). The query is parsed using QueryParser, then it is
rewritten, and given to the highlighter. Then, method
getBestTextFragments is called.
The method returns some fragments which contain only one of the
words in the search phrase. Isn't this wrong? Since this is a
phrase query, shouldn't the highlighter look for fragments which
contain all three words, and even more, only for fragments in which
the three words are adjascent (based on the token stream returned
by the analyzer)?
"wrong" is subjective in this case. I personally prefer exact
highlighting based on what matched, not just individual term
extraction. I have, in one project, converted all queries to a
SpanQuery and used getSpans() to do highlighting in an accurate way.
This particular code is not generalizable easily and was written
under contract, so I cannot share it, but it actually was not very
complex to do.
Erik
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]