On Friday 13 April 2007 04:04, Erick Erickson wrote: > As I understand it, there really is no "space indicator". I think of it > as replacing the stop word with a space, which is then discarded.
You can replace all stop words by your own special term value to have space indicator. It is also possible to index nothing at a particular position, for example at the position of a stop word. This gives a "gap" in the index, see below. > so, you're indexing 'you find answer', and both your searches are > looking for 'you find answer', the stop words are just gone as though > they never were. So both queries match. > > But I've been wrong before <G>... > > I can't really speak to the highlighter question, so I'll let someone > more knowledgeable pipe up. > > Erick > > On 4/12/07, Bill Taylor <[EMAIL PROTECTED]> wrote: > > > > I found some discussions of this question from back in 2003, but that was > > many updates ago. > > > > I have built an index using the standard stop analyser which uses the > > standard list of stop words. "will" and :the" are stop words. > > > > As I understand analyzers and phrase queries, when I search for > > > > you will find the answer > > > > using the default slop of 0, I should find any pattern like > > > > you <any stop word> find <any stop word> answer > > > > because the analyzer replaces "will" and "the" in the query with a space > > indicator as it did when analyzing the original input text. Instead, I > > find > > phrases such as > > > > you find an answer > > > > "an" is a stop work, so matching "find an answer" is as expected, but > > there > > is no stop word between "you" and "find" in the original input string. I > > do > > not see why "you find an answer" matches. > > > > What am I doing wrong? The problem may be that you expect a gap in the index. When there is a gap in the index, it is also necessary to adapt the analyzer used for the phrase query to query for a gap. I don't know whether PhraseQuery can handle such an analyzer. To have a gap in the index, you need to change your analyzer to add a gap for a stop word. This can be done by changing the position increment when a stop word is encountered, see Token.setPositionIncrement(). Iirc you need to make a variation on StopFilter for this. Regards, Paul Elschot > > > > > > Also, when I try to highlight after searching for a phrase, the > > highlighter > > highlights individual words wherever it finds them in the input text. The > > documentation suggests that if I use the right scoring system, I will > > highlight only long strings of adjacent tokens which are found in the > > phrase, but I am not sure how to do that. > > > > If necessary, I will paste in samples of my code for creating the indexes > > and doing the search. > > > > > > Thanks. > > > > Bill Taylor > > > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]