> From: Pooja Verlani
> Subject: Phrase stopwords
> To: solr-user@lucene.apache.org
> Date: Wednesday, September 23, 2009, 1:15 PM
> Hi,
> Is it possible to have a phrase as a stopword in solr? In
> case, please share
> how to do so?
>
> regards,
> Pooja
>
I think this can be implemented by combining SynonymFilterFactory and
StopFilterFactory.
syn.txt will contain lines:
phrase as a stopword => somestupidtoken
phrase stopword => somestupidtoken
three words stopword => somestupidtoken
stopwords.txt will contain the line:
somestupidtoken
IMO this will work because SynonymFilterFactory can handle multi-word synonyms
such as a b c d => foo. With expand="false", you can use this filter to reduce
each multi-word stopword to a single placeholder token (one that is unlikely to
occur in your documents). StopFilterFactory then removes that placeholder token.
Together, the two filters remove every multi-word phrase listed in syn.txt from
the token stream.
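For illustration, the analyzer chain in schema.xml might look roughly like the
sketch below. The field type name and the whitespace tokenizer are my own
assumptions, not something from your setup, so adapt them to your existing
field type. I have not tested this exact fragment.

```xml
<!-- Hypothetical field type: the name and the tokenizer are assumptions. -->
<fieldType name="text_phrasestop" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Step 1: collapse each phrase listed in syn.txt into somestupidtoken.
         As far as I know, explicit "=>" mappings are applied whatever the
         value of expand, so expand="false" is kept here only to match the
         description above. -->
    <filter class="solr.SynonymFilterFactory" synonyms="syn.txt"
            ignoreCase="true" expand="false"/>
    <!-- Step 2: drop somestupidtoken, which is listed in stopwords.txt. -->
    <filter class="solr.StopFilterFactory" words="stopwords.txt"
            ignoreCase="true"/>
  </analyzer>
</fieldType>
```

The order matters: the synonym filter has to run before the stop filter,
otherwise the placeholder token does not exist yet when the stop filter runs.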
Hope this helps.