Try separating the words of each multi-word synonym with a null byte:

simple\0syrup,sugar\0syrup,stock\0syrup

See https://issues.apache.org/jira/browse/LUCENE-4499 for details.
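
To illustrate why the separator matters, here is a toy model in Python. It is not Solr's or Lucene's SynonymFilter; parse_synonyms and space_tokenize are hypothetical names, and the sketch assumes a tokenizer that splits on spaces only. Whether a real analyzer keeps a null byte inside a token depends on the tokenizer you configure and on the filter discussed in the issue above.

```python
# Toy model only: NOT Solr's or Lucene's SynonymFilter implementation.
# parse_synonyms and space_tokenize are hypothetical names for illustration.

def space_tokenize(text):
    # Assumption: the tokenizer splits on spaces only,
    # so a null byte stays inside the token.
    return text.split(" ")

def parse_synonyms(line, tokenize):
    """Return the set of terms produced by one synonyms.txt line."""
    terms = set()
    for entry in line.split(","):
        terms.update(tokenize(entry.strip()))
    return terms

# Space-separated phrases are broken into single words.
spaced = parse_synonyms("simple syrup,sugar syrup,stock syrup", space_tokenize)

# Null-byte-separated phrases survive as single terms.
joined = parse_synonyms("simple\0syrup,sugar\0syrup,stock\0syrup", space_tokenize)

print("stock" in spaced)          # the single word leaks out as its own term
print("stock" in joined)          # only whole phrases exist as terms
print("stock\0syrup" in joined)
```

In the first case "stock" ends up as a term of its own, which is the unwanted match described in the quoted message below; in the second case only the whole phrases exist as terms.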

roman

On Sun, Feb 5, 2012 at 10:31 PM, Zac Smith <z...@trinkit.com> wrote:

> Thanks for your response. When I don't include the KeywordTokenizerFactory
> in the SynonymFilter definition, I get additional term values that I don't
> want.
>
> e.g. synonyms.txt looks like:
> simple syrup,sugar syrup,stock syrup
>
> A document with a value containing 'simple syrup' can now be found when
> searching for just 'stock'.
>
> So the problem I am trying to address with KeywordTokenizerFactory is to
> prevent my multi-word synonyms from getting broken down into single words.
>
> Thanks
> Zac
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Sunday, February 05, 2012 8:07 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Multi word synonyms
>
> I'm not quite sure what you're trying to do with KeywordTokenizerFactory
> in your SynonymFilter definition, but if I use the defaults, then the
> all-phrase form works just fine.
>
> So the question is "what problem are you trying to address by using
> KeywordTokenizerFactory?"
>
> Best
> Erick
>
> On Sun, Feb 5, 2012 at 8:21 AM, O. Klein <kl...@octoweb.nl> wrote:
> > Your query analyser will tokenize "simple syrup" into "simple" and
> > "syrup", and won't match on "simple syrup" in the synonyms.txt.
> >
> > So you have to change the query analyzer to use KeywordTokenizerFactory
> > as well.
> >
> > It might be an idea to make one field for synonyms only with this
> > tokenizer and another field to search on, and use dismax. Never tried
> > this though.
> >
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/Multi-word-synonyms-tp3716292p3717215.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
>
>
>
