Just for the record, I'd like to conclude this thread.

First, you were right: there was no difference in behaviour between the fq
and q parameters.

I realized that:

1) my synonym (hotel de ville) has a stopword in it (de), and since I used
tokenizerFactory="solr.KeywordTokenizerFactory" in my synonyms declaration,
there was no stopword removal in the indexed expression. So when requesting
"hotel de ville", after stopword removal in the query, Solr was comparing
the indexed "hotel de ville"
with the query "hotel ville"

but my queries never even got to that point, since

2) I made a mistake by using "mairie" alone in the admin interface when
testing my schema. The real field value was something like "collectivités
territoriales mairie",
so the synonym "hotel de ville" was not even applied, because the
tokenizerFactory="solr.KeywordTokenizerFactory" in my synonym definition
does not split the field into words when parsing.
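
To make the two points above concrete, here is a small Python sketch. It is
only a toy model of the behaviour as I described it, not Solr's real analysis
chain: the stopword list, the synonym table and the function names are all
made up for the illustration, and it assumes the synonym lookup is done on
the whole, unsplit field value.

```python
# Toy model of the two effects described above. This is NOT Solr code;
# every name below is hypothetical and only illustrates the explanation.

STOPWORDS = {"de"}  # assumed stopword list

# Assumed synonym table: the multi-word output is kept as ONE token.
SYNONYMS = {"mairie": ["hotel de ville", "mairie"]}


def remove_stopwords(tokens):
    """Drop tokens that are exactly equal to a stopword."""
    return [t for t in tokens if t not in STOPWORDS]


def index_tokens(field_value):
    """Index side, as described: the value is looked up whole, never split."""
    expanded = SYNONYMS.get(field_value, [field_value])
    return remove_stopwords(expanded)


def query_tokens(phrase):
    """Query side: split on whitespace, then remove stopwords."""
    return remove_stopwords(phrase.split())


# 1) "de" survives inside the single indexed token, but not in the query.
indexed = index_tokens("mairie")
queried = query_tokens("hotel de ville")
print(indexed)                       # ['hotel de ville', 'mairie']
print(queried)                       # ['hotel', 'ville']
print(" ".join(queried) in indexed)  # False

# 2) The real field value is longer, so the synonym entry is never hit.
print(index_tokens("collectivités territoriales mairie"))
```

In this model the query side ends up with "hotel ville" while the index side
holds "hotel de ville", and the longer field value passes through without any
synonym being added.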

So my problem is not solved, and I'm considering solving it outside of
Solr's scope, unless someone else has a clue.

Thanks again,
Elisabeth



2012/4/25 Erick Erickson <erickerick...@gmail.com>

> A little farther down the debug info output you'll find something
> like this (I specified fq=name:features)
>
> <arr name="parsed_filter_queries">
> <str>name:features</str>
> </arr>
>
>
> so it may well give you some clue. But unless I'm reading things wrong,
> your q is going against a field that has much more information than the
> CATEGORY_ANALYZED field, is it possible that the data from your
> test cases simply isn't _in_ CATEGORY_ANALYZED?
>
> Best
> Erick
>
> On Wed, Apr 25, 2012 at 9:39 AM, elisabeth benoit
> <elisaelisael...@gmail.com> wrote:
> > I'm not at the office until next Wednesday, and I don't have my Solr at
> > hand, but isn't debugQuery=on giving information only about q parameter
> > matching and nothing about the fq parameter? Or do you mean
> > "parsed_filter_queries" gives information about fq?
> >
> > CATEGORY_ANALYZED is being populated by a copyField instruction in
> > schema.xml, and has the same field type as my catchall field, the search
> > field for my searchHandler (the one being used by q parameter).
> >
> > CATEGORY (a string) is copied in CATEGORY_ANALYZED (field type is text)
> >
> > CATEGORY (a string) is copied in catchall field (field type is text),
> > and a lot of other fields are copied too in that catchall field.
> >
> > So as far as I can see, the same analysis should be done in both cases,
> > but obviously I'm missing something, and the only thing I can think of
> > is a different behavior between q and fq parameter.
> >
> > I'll check that parsed_filter_queries first thing in the morning next
> > Wednesday.
> >
> > Thanks a lot for your help.
> >
> > Elisabeth
> >
> >
> > 2012/4/24 Erick Erickson <erickerick...@gmail.com>
> >
> >> Elisabeth:
> >>
> >> What shows up in the debug section of the response when you add
> >> &debugQuery=on? There should be some bit of that section like:
> >> "parsed_filter_queries"
> >>
> >> My other question is "are you absolutely sure that your
> >> CATEGORY_ANALYZED field has the correct content?". How does it
> >> get populated?
> >>
> >> Nothing jumps out at me here....
> >>
> >> Best
> >> Erick
> >>
> >> On Tue, Apr 24, 2012 at 9:55 AM, elisabeth benoit
> >> <elisaelisael...@gmail.com> wrote:
> >> > yes, thanks, but this is NOT my question.
> >> >
> >> > I was wondering why I have multiple matches with q="hotel de ville"
> >> > and no match with fq=CATEGORY_ANALYZED:"hotel de ville", since in
> >> > both cases I'm searching in the same solr fieldType.
> >> >
> >> > Why is the q parameter behaving differently in that case? Why do the
> >> > quotes work in one case and not in the other?
> >> >
> >> > Does anyone know?
> >> >
> >> > Thanks,
> >> > Elisabeth
> >> >
> >> > 2012/4/24 Jeevanandam <je...@myjeeva.com>
> >> >
> >> >>
> >> >> usage of q and fq
> >> >>
> >> >> q => is typically the main query for the search request
> >> >>
> >> >> fq => is Filter Query; generally used to restrict the super set of
> >> >> documents without influencing score (more info:
> >> >> http://wiki.apache.org/solr/CommonQueryParameters#q)
> >> >>
> >> >> For example:
> >> >> ------------
> >> >> q="hotel de ville" ===> returns 100 documents
> >> >>
> >> >> q="hotel de ville"&fq=price:[100 TO *]&fq=roomType:"King size Bed"
> >> >> ===> returns 40 documents from super set of 100 documents
> >> >>
> >> >>
> >> >> hope this helps!
> >> >>
> >> >> - Jeevanandam
> >> >>
> >> >>
> >> >>
> >> >> On 24-04-2012 3:08 pm, elisabeth benoit wrote:
> >> >>
> >> >>> Hello,
> >> >>>
> >> >>> I'd like to resume this post.
> >> >>>
> >> >>> The only way I found to avoid splitting synonyms into words in
> >> >>> synonyms.txt is to use the line
> >> >>>
> >> >>>  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> >> >>> ignoreCase="true" expand="true"
> >> >>> tokenizerFactory="solr.KeywordTokenizerFactory"/>
> >> >>>
> >> >>> in schema.xml
> >> >>>
> >> >>> where tokenizerFactory="solr.KeywordTokenizerFactory"
> >> >>>
> >> >>> instructs SynonymFilterFactory not to break synonyms into words on
> >> >>> white spaces when parsing the synonyms file.
> >> >>>
> >> >>> So now it works fine: "mairie" is mapped into "hotel de ville", and
> >> >>> when I send the request q="hotel de ville" (quotes are mandatory to
> >> >>> prevent the analyzer from splitting hotel de ville on white spaces),
> >> >>> I get answers with the word "mairie".
> >> >>>
> >> >>> But when I use the fq parameter (fq=CATEGORY_ANALYZED:"hotel de
> >> >>> ville"), it doesn't work!!!
> >> >>>
> >> >>> CATEGORY_ANALYZED is the same field type as the default search
> >> >>> field. This means that when I send q="hotel de ville" and
> >> >>> fq=CATEGORY_ANALYZED:"hotel de ville", solr uses the same analyzer,
> >> >>> the one with the line
> >> >>>
> >> >>> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> >> >>> ignoreCase="true" expand="true"
> >> >>> tokenizerFactory="solr.KeywordTokenizerFactory"/>.
> >> >>>
> >> >>> Does anyone have a clue what is different between q analysis
> >> >>> behaviour and fq analysis behaviour?
> >> >>>
> >> >>> Thanks a lot
> >> >>> Elisabeth
> >> >>>
> >> >>> 2012/4/12 elisabeth benoit <elisaelisael...@gmail.com>
> >> >>>
> >> >>>  oh, that's right.
> >> >>>>
> >> >>>> thanks a lot,
> >> >>>> Elisabeth
> >> >>>>
> >> >>>>
> >> >>>> 2012/4/11 Jeevanandam Madanagopal <je...@myjeeva.com>
> >> >>>>
> >> >>>>  Elisabeth -
> >> >>>>>
> >> >>>>> As you described, the mapping below might suit your need.
> >> >>>>> mairie => hotel de ville, mairie
> >> >>>>>
> >> >>>>> mairie gets expanded to "hotel de ville" and "mairie" at index
> >> >>>>> time. So "mairie" and "hotel de ville" are searchable on the
> >> >>>>> document.
> >> >>>>>
> >> >>>>> However, the white space tokenizer splitting at query time will
> >> >>>>> still be a problem, as described by Markus.
> >> >>>>>
> >> >>>>> --Jeevanandam
> >> >>>>>
> >> >>>>> On Apr 11, 2012, at 12:30 PM, elisabeth benoit wrote:
> >> >>>>>
> >> >>>>> > <<Have you tried the "=>" mapping instead? Something
> >> >>>>> > <<like
> >> >>>>> > <<hotel de ville => mairie
> >> >>>>> > <<might work for you.
> >> >>>>> >
> >> >>>>> > Yes, thanks, I've tried it, but from what I understand it doesn't
> >> >>>>> > solve my problem, since this means hotel de ville will be replaced
> >> >>>>> > by mairie at index time (I use synonyms only at index time). So
> >> >>>>> > when a user asks for "hôtel de ville", it won't match.
> >> >>>>> >
> >> >>>>> > In fact, at index time I have mairie in my data, but I want the
> >> >>>>> > user to be able to request "mairie" or "hôtel de ville" and have
> >> >>>>> > mairie as an answer, and not have mairie as an answer when
> >> >>>>> > requesting "hôtel".
> >> >>>>> >
> >> >>>>> >
> >> >>>>> > <<To map `mairie` to `hotel de ville` as single token you must
> >> >>>>> > <<escape your white space.
> >> >>>>> >
> >> >>>>> > <<mairie, hotel\ de\ ville
> >> >>>>> >
> >> >>>>> > <<This results in a problem if your tokenizer splits on white
> >> >>>>> > <<space at query time.
> >> >>>>> >
> >> >>>>> > Ok, I guess this means I have a problem. No simple solution, since
> >> >>>>> > at query time my tokenizer does split on white spaces.
> >> >>>>> >
> >> >>>>> > I guess my problem is more or less one of the problems discussed in
> >> >>>>> >
> >> >>>>> > http://lucene.472066.n3.nabble.com/Multi-word-synonyms-td3716292.html#a3717215
> >> >>>>> >
> >> >>>>> >
> >> >>>>> > Thanks a lot for your answers,
> >> >>>>> > Elisabeth
> >> >>>>> >
> >> >>>>> >
> >> >>>>> >
> >> >>>>> >
> >> >>>>> >
> >> >>>>> > 2012/4/10 Erick Erickson <erickerick...@gmail.com>
> >> >>>>> >
> >> >>>>> >> Have you tried the "=>" mapping instead? Something
> >> >>>>> >> like
> >> >>>>> >> hotel de ville => mairie
> >> >>>>> >> might work for you.
> >> >>>>> >>
> >> >>>>> >> Best
> >> >>>>> >> Erick
> >> >>>>> >>
> >> >>>>> >> On Tue, Apr 10, 2012 at 1:41 AM, elisabeth benoit
> >> >>>>> >> <elisaelisael...@gmail.com> wrote:
> >> >>>>> >>> Hello,
> >> >>>>> >>>
> >> >>>>> >>> I've read several posts on this issue, but can't find a real
> >> >>>>> >>> solution to my multi-word synonym matching problem.
> >> >>>>> >>>
> >> >>>>> >>> I have in my synonyms.txt an entry like
> >> >>>>> >>>
> >> >>>>> >>> mairie, hotel de ville
> >> >>>>> >>>
> >> >>>>> >>> and my index time analyzer is configured as follows for
> >> >>>>> >>> synonyms.
> >> >>>>> >>>
> >> >>>>> >>> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> >> >>>>> >>> ignoreCase="true" expand="true"/>
> >> >>>>> >>>
> >> >>>>> >>> The problem I have is that now "mairie" matches with "hotel", and
> >> >>>>> >>> I would only want "mairie" to match with "hotel de ville" and
> >> >>>>> >>> "mairie".
> >> >>>>> >>>
> >> >>>>> >>> When I look into the analyzer, I see that "mairie" is mapped into
> >> >>>>> >>> "hotel", and the words "de ville" are added in second and third
> >> >>>>> >>> position. To change that, I tried
> >> >>>>> >>>
> >> >>>>> >>> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> >> >>>>> >>> ignoreCase="true" expand="true"
> >> >>>>> >>> tokenizerFactory="solr.KeywordTokenizerFactory"/> (as I read in
> >> >>>>> >>> one post)
> >> >>>>> >>>
> >> >>>>> >>> and I can see now in the analyzer that "mairie" is mapped to
> >> >>>>> >>> "hotel de ville", but now when I query "hotel de ville", it
> >> >>>>> >>> doesn't match "mairie" at all.
> >> >>>>> >>>
> >> >>>>> >>> Does anyone have a clue about what I'm doing wrong?
> >> >>>>> >>>
> >> >>>>> >>> I'm using Solr 3.4.
> >> >>>>> >>>
> >> >>>>> >>> Thanks,
> >> >>>>> >>> Elisabeth
> >> >>>>> >>
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>
> >> >>
> >>
>
