Hello, I'd like to come back to this post.

The only way I found to keep multi-word synonyms in synonyms.txt from being split into words is to use this line in schema.xml:

<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true" tokenizerFactory="solr.KeywordTokenizerFactory"/>

Here tokenizerFactory="solr.KeywordTokenizerFactory" instructs SynonymFilterFactory not to break synonyms into words on white space when parsing the synonyms file.

So now it works fine: "mairie" is mapped to "hotel de ville", and when I send the request q="hotel de ville" (the quotes are mandatory to prevent the analyzer from splitting hotel de ville on white space), I get answers containing the word "mairie".

But when I use the fq parameter (fq=CATEGORY_ANALYZED:"hotel de ville"), it doesn't work!

CATEGORY_ANALYZED has the same field type as the default search field. This means that when I send q="hotel de ville" and fq=CATEGORY_ANALYZED:"hotel de ville", Solr uses the same analyzer, the one containing the SynonymFilterFactory line shown above.

Does anyone have a clue what the difference is between q analysis behaviour and fq analysis behaviour?

Thanks a lot,
Elisabeth

2012/4/12 elisabeth benoit <elisaelisael...@gmail.com>

> oh, that's right.
>
> thanks a lot,
> Elisabeth
>
>
> 2012/4/11 Jeevanandam Madanagopal <je...@myjeeva.com>
>
>> Elisabeth -
>>
>> As you described, the mapping below might suit your need.
>> mairie => hotel de ville, mairie
>>
>> mairie gets expanded to "hotel de ville" and "mairie" at index time. So
>> "mairie" and "hotel de ville" are searchable on the document.
>>
>> However, the white space tokenizer splitting at query time will still be a
>> problem, as described by Markus.
>>
>> --Jeevanandam
>>
>> On Apr 11, 2012, at 12:30 PM, elisabeth benoit wrote:
>>
>> > <<Have you tried the "=>" mapping instead? Something
>> > <<like
>> > <<hotel de ville => mairie
>> > <<might work for you.
>> >
>> > Yes, thanks, I've tried it but from what I understand it doesn't solve my
>> > problem, since this means hotel de ville will be replaced by mairie at
>> > index time (I use synonyms only at index time). So when a user asks for
>> > "hôtel de ville", it won't match.
>> >
>> > In fact, at index time I have mairie in my data, but I want the user to be
>> > able to request "mairie" or "hôtel de ville" and have mairie as an answer,
>> > and not have mairie as an answer when requesting "hôtel".
>> >
>> >
>> > <<To map `mairie` to `hotel de ville` as single token you must escape your
>> > <<white space.
>> >
>> > <<mairie, hotel\ de\ ville
>> >
>> > <<This results in a problem if your tokenizer splits on white space at
>> > <<query time.
>> >
>> > Ok, I guess this means I have a problem. No simple solution, since at query
>> > time my tokenizer does split on white space.
>> >
>> > I guess my problem is more or less one of the problems discussed in
>> >
>> > http://lucene.472066.n3.nabble.com/Multi-word-synonyms-td3716292.html#a3717215
>> >
>> >
>> > Thanks a lot for your answers,
>> > Elisabeth
>> >
>> >
>> > 2012/4/10 Erick Erickson <erickerick...@gmail.com>
>> >
>> >> Have you tried the "=>" mapping instead? Something
>> >> like
>> >> hotel de ville => mairie
>> >> might work for you.
>> >>
>> >> Best
>> >> Erick
>> >>
>> >> On Tue, Apr 10, 2012 at 1:41 AM, elisabeth benoit
>> >> <elisaelisael...@gmail.com> wrote:
>> >>> Hello,
>> >>>
>> >>> I've read several posts on this issue, but can't find a real solution to
>> >>> my multi-word synonyms matching problem.
>> >>>
>> >>> I have in my synonyms.txt an entry like
>> >>>
>> >>> mairie, hotel de ville
>> >>>
>> >>> and my index time analyzer is configured as follows for synonyms.
>> >>>
>> >>> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>> >>> ignoreCase="true" expand="true"/>
>> >>>
>> >>> The problem I have is that now "mairie" matches with "hotel" and I would
>> >>> only want "mairie" to match with "hotel de ville" and "mairie".
>> >>>
>> >>> When I look into the analyzer, I see that "mairie" is mapped into "hotel",
>> >>> and the words "de ville" are added in second and third position. To change
>> >>> that, I tried to do
>> >>>
>> >>> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>> >>> ignoreCase="true" expand="true"
>> >>> tokenizerFactory="solr.KeywordTokenizerFactory"/> (as I read in one post)
>> >>>
>> >>> and I can see now in the analyzer that "mairie" is mapped to "hotel de
>> >>> ville", but now when I have the query "hotel de ville", it doesn't match at
>> >>> all with "mairie".
>> >>>
>> >>> Does anyone have a clue what I'm doing wrong?
>> >>>
>> >>> I'm using Solr 3.4.
>> >>>
>> >>> Thanks,
>> >>> Elisabeth
>> >>
>> >>
>> >
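
For readers following the thread, the matching behaviour described above can be sketched with a toy Python model. This is not Solr code and calls no Solr API: the function names (synonym_entries, index_tokens, matches) are hypothetical, token positions are ignored, and a query is assumed to match only when every one of its tokens was indexed. It only illustrates why "mairie" matched "hotel" with the default synonym parsing, and why a query split on white space cannot match the single token "hotel de ville".

```python
# Toy model of the index-time synonym expansion discussed in this thread.
# Assumption: this is a simplification for illustration, not Solr's algorithm.

SYNONYM_LINE = "mairie, hotel de ville"


def synonym_entries(line, keyword_tokenizer):
    """Parse one comma-separated synonyms line into token lists.

    keyword_tokenizer=True keeps each entry as one token (the effect the
    thread attributes to tokenizerFactory="solr.KeywordTokenizerFactory").
    keyword_tokenizer=False splits each entry on white space.
    """
    entries = [entry.strip() for entry in line.split(",")]
    if keyword_tokenizer:
        return [[entry] for entry in entries]
    return [entry.split() for entry in entries]


def index_tokens(word, line, keyword_tokenizer):
    """Return the tokens indexed for a document containing `word` (expand=true)."""
    entries = synonym_entries(line, keyword_tokenizer)
    if [word] not in entries:
        return {word}
    return {token for entry in entries for token in entry}


def matches(query_tokens, indexed):
    """A query matches when all of its tokens are present in the index."""
    return all(token in indexed for token in query_tokens)


split_index = index_tokens("mairie", SYNONYM_LINE, keyword_tokenizer=False)
keyword_index = index_tokens("mairie", SYNONYM_LINE, keyword_tokenizer=True)

print(matches(["hotel"], split_index))                  # True: the unwanted match
print(matches(["hotel"], keyword_index))                # False
print(matches(["hotel", "de", "ville"], keyword_index)) # False: query was split
print(matches(["hotel de ville"], keyword_index))       # True: query kept whole
```

In this model the only query that reaches the document indexed with the keyword tokenizer is the one that arrives as the single token "hotel de ville", so whatever differs between the q and fq paths would have to show up in how the query string is tokenized before the lookup.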