CommonGrams is a tool for this. It indexes "is a" as a single token, while "is" and "a" on their own can still be removed as stopwords by a StopFilter later in the chain.
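As a rough, untested sketch, an index-time analyzer using it might look something like this. The stopwords.txt file name and the order of the filters are my assumptions, not something taken from your schema; the words attribute is where the filter reads its list of common words.

```xml
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <!-- Emits word pairs that include a common word as one combined token,
       alongside the original single words -->
  <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt" ignoreCase="true"/>
  <!-- Drops the single stopwords; the combined tokens are not in the list, so they stay -->
  <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
</analyzer>
```

The query analyzer would need a matching setup so that phrase queries produce the same combined tokens; Solr has solr.CommonGramsQueryFilterFactory for that side.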
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.CommonGramsFilterFactory

On 3/13/10, Christopher Ball <christopher.b...@metaheuristica.com> wrote:
> Thank you for the idea Mitch, but it just doesn't seem right that I should
> have to resort to scoring when what I really need seems so fundamental.
>
> Logically, what I want is a "phrase filter factory" that would match on
> phrases listed in a file, like stopwords, but in this case index the match
> and then discard the words of the phrase from the stream before passing it
> on to the next filter, given that the phrases are embedded in paragraphs
> which have other valid index material.
>
> So an analyzer would look something like:
>
> <analyzer type="index">
>   <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>   <filter class="solr.WordDelimiterFilterFactory"/>
>   <filter class="solr.LowerCaseFilterFactory"/>
>   <filter class="solr.PhraseFilterFactory"/>
>   <filter class="solr.StopFilterFactory"/>
> </analyzer>
>
> Of course, one riddle that this leaves us is how to match a tokenized
> stream . . . so maybe I need to also write my own tokenizer. It just seems
> like this would have been a previously desired and solved problem.
>
> Or maybe I should try solr.KeepWordFilterFactory, if it can deal with
> phrases . . ?
>
> I'm stumped =(
>
> -----Original Message-----
> From: MitchK [mailto:mitc...@web.de]
> Sent: Saturday, March 13, 2010 8:12 AM
> To: solr-user@lucene.apache.org
> Subject: RE: Index an entire Phrase and not it's constituent parts?
>
>
> Christopher,
>
> maybe the SynonymFilter can help you to solve your problem.
>
> Let me try to explain:
> If you create an extra field in the index for your use case, you can boost
> matches on it in a special way.
>
> The next step is creating an extra synonym file.
> as much as => SpecialPhrase1
> in amount of => SpecialPhrase2
> ... and so on...
>
> If a user wants to query for something like "as much as I love you", you can
> do some boosting on matches from the SpecialPhrase field, and you are able to
> return results from both: the normal StopWordFiltered data and the
> SpecialPhrase data.
>
> If this fits your needs, please let me know.
>
> Kind regards
> - Mitch
> --
> View this message in context:
> http://old.nabble.com/Index-an-entire-Phrase-and-not-it%27s-constituent-parts--tp27785521p27887564.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

--
Lance Norskog
goks...@gmail.com