Hello,

I'd like to follow up on this thread.

The only way I found to keep the synonyms in synonyms.txt from being split
into words is to use the line

 <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
ignoreCase="true" expand="true"
tokenizerFactory="solr.KeywordTokenizerFactory"/>

in schema.xml

where tokenizerFactory="solr.KeywordTokenizerFactory"

instructs SynonymFilterFactory not to break synonyms into words on
whitespace when parsing the synonyms file.
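
To make the difference concrete, here is a small toy model in Python of how a
rule from synonyms.txt could be split under each tokenizer. This is only a
sketch of the idea, not Solr's actual code: the function names
(whitespace_tokenize, keyword_tokenize, parse_rule) are made up for the
illustration.

```python
# Toy model (NOT Solr's implementation) of parsing one synonyms.txt rule.
# All function names here are hypothetical, chosen for illustration only.

def whitespace_tokenize(text):
    """Mimics a whitespace tokenizer: one token per word."""
    return text.split()


def keyword_tokenize(text):
    """Mimics a keyword tokenizer: the whole entry stays a single token."""
    stripped = text.strip()
    return [stripped] if stripped else []


def parse_rule(line, tokenize):
    """Split a comma-separated rule into entries, then tokenize each entry."""
    return [tokenize(entry) for entry in line.split(",")]


rule = "mairie, hotel de ville"
print(parse_rule(rule, whitespace_tokenize))  # [['mairie'], ['hotel', 'de', 'ville']]
print(parse_rule(rule, keyword_tokenize))     # [['mairie'], ['hotel de ville']]
```

With the keyword-style split, "hotel de ville" stays one entry, which is the
behaviour I get with tokenizerFactory="solr.KeywordTokenizerFactory".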

So now it works fine: "mairie" is mapped to "hotel de ville", and when I
send the request q="hotel de ville" (the quotes are mandatory to prevent the
analyzer from splitting hotel de ville on whitespace), I get answers
containing the word "mairie".

But when I use the fq parameter (fq=CATEGORY_ANALYZED:"hotel de ville"), it
doesn't work!

CATEGORY_ANALYZED has the same field type as the default search field. This
means that when I send q="hotel de ville" and
fq=CATEGORY_ANALYZED:"hotel de ville", Solr uses the same analyzer, the one
with the line

<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
ignoreCase="true" expand="true"
tokenizerFactory="solr.KeywordTokenizerFactory"/>

Does anyone have a clue what differs between the analysis behaviour of q
and the analysis behaviour of fq?
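
One way to compare the two cases might be to send each request with
debugQuery=true and look at how Solr reports the parsed main query and the
parsed filter query. The Python sketch below only builds the two request
URLs; the host, port and path in SOLR_SELECT_URL are assumptions, and the
helper name debug_url is made up for the illustration.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: adjust host, port and path to the actual installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def debug_url(q, fq=None):
    """Build a select URL with debugQuery=true, optionally with a filter query."""
    params = [("q", q), ("debugQuery", "true")]
    if fq is not None:
        params.append(("fq", fq))
    return SOLR_SELECT_URL + "?" + urlencode(params)


# Case that works: the phrase is sent in q.
print(debug_url('"hotel de ville"'))

# Case that fails: the same phrase is sent in fq against CATEGORY_ANALYZED.
print(debug_url("*:*", fq='CATEGORY_ANALYZED:"hotel de ville"'))
```

If the debug section of the two responses shows the phrase parsed
differently for q and for fq, that would narrow down where the behaviour
diverges.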

Thanks a lot
Elisabeth

2012/4/12 elisabeth benoit <elisaelisael...@gmail.com>

> oh, that's right.
>
> thanks a lot,
> Elisabeth
>
>
> 2012/4/11 Jeevanandam Madanagopal <je...@myjeeva.com>
>
>> Elisabeth -
>>
>> As you described, the mapping below might suit your needs.
>> mairie => hotel de ville, mairie
>>
>> mairie gets expanded to "hotel de ville" and "mairie" at index time.  So
>> "mairie" and "hotel de ville" are both searchable on the document.
>>
>> However, the whitespace tokenizer splitting at query time will still be a
>> problem, as described by Markus.
>>
>> --Jeevanandam
>>
>> On Apr 11, 2012, at 12:30 PM, elisabeth benoit wrote:
>>
>> > <<Have you tried the "=>" mapping instead? Something
>> > <<like
>> > <<hotel de ville => mairie
>> > <<might work for you.
>> >
>> > Yes, thanks, I've tried it, but from what I understand it doesn't solve
>> > my problem, since this means hotel de ville will be replaced by mairie
>> > at index time (I use synonyms only at index time). So when a user asks
>> > for "hôtel de ville", it won't match.
>> >
>> > In fact, at index time I have mairie in my data, but I want the user to
>> > be able to request "mairie" or "hôtel de ville" and get mairie as an
>> > answer, and not get mairie as an answer when requesting "hôtel".
>> >
>> >
>> > <<To map `mairie` to `hotel de ville` as single token you must escape
>> > <<your white space.
>> >
>> > <<mairie, hotel\ de\ ville
>> >
>> > <<This results in a problem if your tokenizer splits on white space at
>> > <<query time.
>> >
>> > Ok, I guess this means I have a problem. No simple solution, since at
>> > query time my tokenizer does split on whitespace.
>> >
>> > I guess my problem is more or less one of the problems discussed in
>> >
>> >
>> > http://lucene.472066.n3.nabble.com/Multi-word-synonyms-td3716292.html#a3717215
>> >
>> >
>> > Thanks a lot for your answers,
>> > Elisabeth
>> >
>> >
>> >
>> >
>> >
>> > 2012/4/10 Erick Erickson <erickerick...@gmail.com>
>> >
>> >> Have you tried the "=>" mapping instead? Something
>> >> like
>> >> hotel de ville => mairie
>> >> might work for you.
>> >>
>> >> Best
>> >> Erick
>> >>
>> >> On Tue, Apr 10, 2012 at 1:41 AM, elisabeth benoit
>> >> <elisaelisael...@gmail.com> wrote:
>> >>> Hello,
>> >>>
>> >>> I've read several posts on this issue, but can't find a real solution
>> >>> to my multi-word synonym matching problem.
>> >>>
>> >>> I have in my synonyms.txt an entry like
>> >>>
>> >>> mairie, hotel de ville
>> >>>
>> >>> and my index-time analyzer is configured as follows for synonyms.
>> >>>
>> >>> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>> >>> ignoreCase="true" expand="true"/>
>> >>>
>> >>> The problem I have is that now "mairie" matches "hotel", and I would
>> >>> only want "mairie" to match "hotel de ville" and "mairie".
>> >>>
>> >>> When I look at the analyzer, I see that "mairie" is mapped to "hotel",
>> >>> and the words "de ville" are added in second and third position. To
>> >>> change that, I tried to do
>> >>>
>> >>> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>> >>> ignoreCase="true" expand="true"
>> >>> tokenizerFactory="solr.KeywordTokenizerFactory"/> (as I read in one post)
>> >>>
>> >>> and I can now see in the analyzer that "mairie" is mapped to "hotel de
>> >>> ville", but now when I query "hotel de ville", it doesn't match
>> >>> "mairie" at all.
>> >>>
>> >>> Does anyone have a clue what I'm doing wrong?
>> >>>
>> >>> I'm using Solr 3.4.
>> >>>
>> >>> Thanks,
>> >>> Elisabeth
>> >>
>>
>>
>
