Elisabeth -
From what you describe, the mapping below might suit your needs:
mairie => hotel de ville, mairie
With this rule, mairie gets expanded to both "hotel de ville" and "mairie" at
index time, so a document containing "mairie" is searchable by either term.
However, a whitespace tokenizer splitting "hotel de ville" at query time will
still be a problem, as Markus described.
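
The two failure modes discussed in this thread can be sketched roughly as follows. This is only a simplified model in Python, not how Solr works internally: the helper names index_tokens and query_matches are made up for illustration, positions are ignored, and a query is assumed to match only when every whitespace-separated query token is an indexed term.

```python
# Simplified model of index-time synonym expansion; NOT Solr's actual code.
# index_tokens and query_matches are hypothetical helpers for illustration.

def index_tokens(text, synonyms, multiword_as_single_token):
    """Return the set of terms indexed for `text` after synonym expansion."""
    terms = set()
    for word in text.lower().split():
        for expansion in synonyms.get(word, [word]):
            if multiword_as_single_token:
                # Keyword-style parsing: "hotel de ville" stays one term.
                terms.add(expansion)
            else:
                # Whitespace-style parsing: each word becomes its own term.
                terms.update(expansion.split())
    return terms


def query_matches(query, indexed_terms):
    """Whitespace-tokenize the query; True if every token is an indexed term."""
    tokens = query.lower().split()
    return bool(tokens) and all(token in indexed_terms for token in tokens)


# Mapping from the rule above: mairie => hotel de ville, mairie
SYNONYMS = {"mairie": ["hotel de ville", "mairie"]}

split_terms = index_tokens("mairie", SYNONYMS, multiword_as_single_token=False)
single_terms = index_tokens("mairie", SYNONYMS, multiword_as_single_token=True)

# Synonym split into words: "hotel" alone matches, which is not wanted.
print(query_matches("hotel", split_terms))

# Synonym kept as one term: the whitespace-split query finds nothing.
print(query_matches("hotel de ville", single_terms))

# The original word still matches in both cases.
print(query_matches("mairie", single_terms))
```

In this model the first case reproduces the unwanted "hotel" match, and the second shows why keeping "hotel de ville" as one indexed term does not help while the query side still splits on white space.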
--Jeevanandam
On Apr 11, 2012, at 12:30 PM, elisabeth benoit wrote:
> <<Have you tried the "=>" mapping instead? Something
> <<like
> <<hotel de ville => mairie
> <<might work for you.
>
> Yes, thanks, I've tried it, but from what I understand it doesn't solve my
> problem, since this means hotel de ville will be replaced by mairie at
> index time (I use synonyms only at index time). So when a user asks for
> "hôtel de ville", it won't match.
>
> In fact, at index time I have mairie in my data, but I want the user to be
> able to request "mairie" or "hôtel de ville" and get mairie as the answer,
> and not get mairie as an answer when requesting "hôtel".
>
>
> <<To map `mairie` to `hotel de ville` as single token you must escape your
> <<white space.
>
> <<mairie, hotel\ de\ ville
>
> <<This results in a problem if your tokenizer splits on white space at
> <<query time.
>
> Ok, I guess this means I have a problem. No simple solution, since at query
> time my tokenizer does split on white space.
>
> I guess my problem is more or less one of the problems discussed in
>
> http://lucene.472066.n3.nabble.com/Multi-word-synonyms-td3716292.html#a3717215
>
>
> Thanks a lot for your answers,
> Elisabeth
>
>
>
>
>
> 2012/4/10 Erick Erickson <erickerick...@gmail.com>
>
>> Have you tried the "=>" mapping instead? Something
>> like
>> hotel de ville => mairie
>> might work for you.
>>
>> Best
>> Erick
>>
>> On Tue, Apr 10, 2012 at 1:41 AM, elisabeth benoit
>> <elisaelisael...@gmail.com> wrote:
>>> Hello,
>>>
>>> I've read several posts on this issue, but can't find a real solution to
>>> my multi-word synonym matching problem.
>>>
>>> I have in my synonyms.txt an entry like
>>>
>>> mairie, hotel de ville
>>>
>>> and my index-time analyzer is configured as follows for synonyms.
>>>
>>> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>>> ignoreCase="true" expand="true"/>
>>>
>>> The problem I have is that now "mairie" matches "hotel", and I would
>>> only want "mairie" to match "hotel de ville" and "mairie".
>>>
>>> When I look into the analyzer, I see that "mairie" is mapped to "hotel",
>>> and the words "de ville" are added in second and third position. To change
>>> that, I tried to do
>>>
>>> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>>> ignoreCase="true" expand="true"
>>> tokenizerFactory="solr.KeywordTokenizerFactory"/> (as I read in one post)
>>>
>>> and I can see now in the analyzer that "mairie" is mapped to "hotel de
>>> ville", but now when I query "hotel de ville", it doesn't match "mairie"
>>> at all.
>>>
>>> Does anyone have a clue what I'm doing wrong?
>>>
>>> I'm using Solr 3.4.
>>>
>>> Thanks,
>>> Elisabeth
>>