Yes, the mm is 100%. Thank you for the detailed answer.

Regards!
Dalius Sidlauskas

On 21/08/12 15:21, Jack Krupansky wrote:
Solr doesn't actually "know" any natural language, so it has no way of assessing whether two token streams "have the same meaning." In your case, the surface forms/syntax are subtly different - two separate terms vs. a single source term with embedded punctuation.

It appears that you are probably using the edismax query parser and probably have "mm" set to "100%" or "q.op" set to "AND" (the "~2" indicates a BooleanQuery with a minMatch of 2 terms). An "mm" of "100%" is equivalent to the "AND" operator some/most of the time.

For the second query you have a "split-term", which is treated as a single term/token until the fieldType analyzer splits it into two terms and then does an "OR" of the sub-terms. Unfortunately, "mm" and "q.op" are not passed down to the analyzer, so you have no way of changing that "OR" to an "AND". This is why you get different results. What you can do is set autoGeneratePhraseQueries="true" on your field type(s) to cause the query parser to generate a phrase query for "d osona" rather than the "OR". That's not the same as "AND", but depending on your application it may be sufficient or even preferable.
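
As a rough sketch of that setting, a field type in schema.xml might look like the fragment below. The name "text_folded" is hypothetical, and the analyzer chain is simply copied from the configuration in the original question; adapt both to your own schema rather than treating this as a definitive configuration.

```xml
<!-- Hypothetical field type name; the analyzer chain is taken from the original question. -->
<fieldType name="text_folded" class="solr.TextField"
           positionIncrementGap="100"
           autoGeneratePhraseQueries="true">
  <analyzer>
    <!-- Replace apostrophes with a space so "d’Osona" tokenizes as "d" and "osona". -->
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="’|'" replacement=" "/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ICUFoldingFilterFactory"/>
  </analyzer>
</fieldType>
```

With this in place, a single source term that the analyzer splits into several tokens should be parsed as a phrase query on that field instead of an "OR" of the sub-terms. Fields using the type need to be reloaded (and, if the analysis chain itself changed, reindexed) before the parsed query reflects it.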

-- Jack Krupansky

-----Original Message-----
From: Dalius Sidlauskas
Sent: Tuesday, August 21, 2012 9:35 AM
To: solr-user@lucene.apache.org
Subject: Different queries for same meaning searches

Hello, here is my index and index analyzer configuration:

<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="’|'" replacement=" "/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.ICUFoldingFilterFactory"/>

Searching for "d Osona" and "d’Osona" creates "d" and "osona" tokens in both cases, but the ParsedQuery is different:

#1 "d Osona"

+((
DisjunctionMaxQuery((search_definitions:d | search_title:d))
DisjunctionMaxQuery((search_definitions:osona | search_title:osona))
)~2)
DisjunctionMaxQuery((search_definitions:"d osona" | search_title:"d osona"^3.0))

#2 "d’Osona"

+DisjunctionMaxQuery((
(search_definitions:d search_definitions:osona) |
(search_title:d search_title:osona)
))
DisjunctionMaxQuery((search_definitions:"d osona" | search_title:"d osona"^3.0))


And the results are different as well. Where can I find an explanation for this?

