On Apr 29, 2006, at 1:59 AM, Marios Skounakis wrote:
Suppose the user enters the following query using a textbox interface: "rate based optimization" (as a phrase query, including the quotes). The query is parsed using QueryParser, then it is rewritten, and given to the highlighter. Then, method getBestTextFragments is called.

The method returns some fragments which contain only one of the words in the search phrase. Isn't this wrong? Since this is a phrase query, shouldn't the highlighter look for fragments which contain all three words, and even more, only for fragments in which the three words are adjascent (based on the token stream returned by the analyzer)?

"wrong" is subjective in this case. I personally prefer exact highlighting based on what matched, not just individual term extraction. I have, in one project, converted all queries to a SpanQuery and used getSpans() to do highlighting in an accurate way. This particular code is not generalizable easily and was written under contract, so I cannot share it, but it actually was not very complex to do.

        Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to