Hi Erik. Yes something like what you describe would do the trick. I did find this:
http://lucene.472066.n3.nabble.com/Concatenate-multiple-tokens-into-one-td1879611.html

I might try the pattern replace filter with stopwords, even though that
feels kinda clunky.

Matt

On Wed, Jun 8, 2011 at 11:04 AM, Erik Hatcher <erik.hatc...@gmail.com> wrote:

> This seems like it deserves some kind of "collecting" TokenFilter(Factory)
> that will slurp up all incoming tokens and glue them together with a space
> (and allow separator to be configurable).  Hmmm.... surprised one of those
> doesn't already exist.  With something like that you could have a standard
> tokenization chain, and put it all back together at the end.
>
>        Erik
>
> On Jun 8, 2011, at 10:59, Matt Mitchell wrote:
>
>> Hi,
>>
>> I have an "autocomplete" fieldType that works really well, but because
>> the KeywordTokenizerFactory (if I understand correctly) is emitting a
>> single token, the stopword filter will not detect any stopwords.
>> Anyone know of a way to strip out stopwords when using
>> KeywordTokenizerFactory? I did try the reg-exp replace filter, but I'm
>> not sure I want to add a bunch of reg-exps for replacing every
>> stopword.
>>
>> Thanks,
>> Matt
>>
>> Here's the fieldType definition:
>>
>> <fieldType name="autocomplete" class="solr.TextField" positionIncrementGap="100">
>>   <analyzer type="index">
>>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>>     <filter class="solr.TrimFilterFactory"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>     <filter class="solr.ASCIIFoldingFilterFactory"/>
>>     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="50"/>
>>   </analyzer>
>>   <analyzer type="query">
>>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>>     <filter class="solr.TrimFilterFactory"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>     <filter class="solr.ASCIIFoldingFilterFactory"/>
>>   </analyzer>
>> </fieldType>
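[Editor's sketch] The pattern-replace-with-stopwords idea Matt mentions could look roughly like this, placed after the LowerCaseFilterFactory so the lowercase alternation matches. The stopword list here is a tiny illustrative sample, not a full list; Solr's PatternReplaceFilterFactory takes `pattern`, `replacement`, and `replace` attributes:

```xml
<!-- Illustrative only: strip a few stopwords from the single keyword
     token, then collapse the leftover whitespace. -->
<filter class="solr.PatternReplaceFilterFactory"
        pattern="\b(a|an|and|of|the)\b" replacement=" " replace="all"/>
<filter class="solr.PatternReplaceFilterFactory"
        pattern="\s+" replacement=" " replace="all"/>
<filter class="solr.TrimFilterFactory"/>
```

This keeps the whole stopword set in one regex instead of one filter per word, though it is still the "clunky" route Matt describes.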
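[Editor's sketch] Erik's "collecting" filter can be illustrated in plain Java, without any Lucene types — the class, method, and stopword set below are made up for the example. The idea: tokenize normally, drop stopwords, then glue the survivors back together with a configurable separator, producing the single string a keyword-style field would index:

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

// Plain-Java sketch of a "collecting" token filter: split on whitespace,
// lowercase, drop stopwords, and rejoin with a configurable separator.
public class CollectingFilterSketch {
    // Illustrative stopword sample, not a complete list.
    private static final Set<String> STOPWORDS = Set.of("a", "an", "and", "of", "the");

    static String collect(String input, String separator) {
        return Arrays.stream(input.trim().split("\\s+"))
                .map(String::toLowerCase)
                .filter(t -> !STOPWORDS.contains(t))
                .collect(Collectors.joining(separator));
    }

    public static void main(String[] args) {
        // Prints: lord rings
        System.out.println(collect("The Lord of the Rings", " "));
    }
}
```

A real TokenFilter(Factory) doing this would consume the whole upstream token stream in `incrementToken()` and emit one combined token, so the standard tokenizer + StopFilter chain could run first and the EdgeNGram filter would still see a single glued-together token.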