On Jan 8, 2008 11:48 PM, chris.b <[EMAIL PROTECTED]> wrote:
>
> Wrapping the whitespaceanalyzer with the ngramfilter it creates unigrams and
> the ngrams that I indicate, while maintaining the whitespaces. :)
> The reason I'm doing this is because I only wish to index names with more
> than one token.
>

Then I am not sure I understand you. Take this input text:

    text by John Bear, old.

A WhiteSpaceAnalyzer would create these tokens:

    text
    by
    John
    Bear,
    old.

An NgramFilter(2,2) wrapping it would create these tokens:

    te
    ex
    xt
    by
    Jo
    ... etc.

You may use other limits, but still no token would have a white space in it.
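
To make the point concrete, here is a rough plain-Python simulation of that chain. It does not use Lucene at all; the helper names (whitespace_tokens, char_ngrams, ngram_filter) are made up for illustration, and the order in which grams are emitted is an assumption. It only shows that character n-grams built from whitespace-split tokens can never contain a space:

```python
def whitespace_tokens(text):
    # Split on runs of whitespace, roughly what a whitespace tokenizer does.
    return text.split()


def char_ngrams(token, min_n, max_n):
    # Character n-grams of a single token, shortest size first.
    grams = []
    for n in range(min_n, max_n + 1):
        for i in range(len(token) - n + 1):
            grams.append(token[i:i + n])
    return grams


def ngram_filter(tokens, min_n, max_n):
    # Apply char_ngrams to each token independently, so a gram never
    # spans two tokens and therefore never contains whitespace.
    out = []
    for tok in tokens:
        out.extend(char_ngrams(tok, min_n, max_n))
    return out


text = "text by John Bear, old."
tokens = whitespace_tokens(text)
grams = ngram_filter(tokens, 2, 2)

print(tokens)
print(grams)
print(any(" " in g for g in grams))
```

Whatever limits are passed to ngram_filter, the last check stays false, because the grams are cut from tokens that were already split on whitespace.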