Yes, the mm is 100%. Thank you for a detailed answer.

Dalius Sidlauskas

On 21/08/12 15:21, Jack Krupansky wrote:
Solr doesn't actually "know" any natural language, so it has no way of assessing whether two token streams "have the same meaning." In your case, the surface forms/syntax are subtly different - two separate terms vs. a single source term with embedded punctuation.

It appears that you are probably using the edismax query parser and have "mm" set to "100%" or "q.op" set to "AND" (the "~2" indicates a BooleanQuery with a minMatch of 2 terms). An "mm" of "100%" is equivalent to the "AND" operator some/most of the time.

For the second query you have a "split-term", which is treated as a single term/token until the fieldType analyzer splits it into two terms; the query parser then does an "OR" of the sub-terms. Unfortunately, "mm" and "q.op" are not passed down to the analyzer, so you have no way of changing that "OR" to an "AND". This is why you get different results. What you can do is set autoGeneratePhraseQueries="true" on your field type(s), which causes the query parser to generate a phrase query for "d osona" rather than the "OR". That's not the same as "AND", but depending on your application it may be sufficient or even preferable.
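
As a rough sketch of that setting, assuming a field type in schema.xml built from the analyzer chain in your question (the name "text_search" is made up for illustration, and your real field type may carry other attributes):

```xml
<!-- Hypothetical field type name; the analyzer chain is the one from the question. -->
<fieldType name="text_search" class="solr.TextField"
           autoGeneratePhraseQueries="true">
  <analyzer>
    <!-- Replace typographic and ASCII apostrophes with a space before tokenizing -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="’|'" replacement=" "/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ICUFoldingFilterFactory"/>
  </analyzer>
</fieldType>
```

With this in place, a split term such as d’Osona should be parsed as a phrase query on "d osona" for each field instead of an "OR" of "d" and "osona". As far as I know the attribute only affects query parsing, so reloading the core should be enough, but please verify that against your Solr version.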

Hello, here is my index and index analyzer configuration:

<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="’|'"
replacement=" "/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.ICUFoldingFilterFactory"/>

Search for "d Osona" and "d’Osona" creates "d" and "osona" tokens. But
ParsedQuery is different:

#1 "d Osona"

DisjunctionMaxQuery((search_definitions:d | search_title:d))
DisjunctionMaxQuery((search_definitions:osona | search_title:osona))
DisjunctionMaxQuery((search_definitions:"d osona" | search_title:"d

#2 "d’Osona"

(search_definitions:d search_definitions:osona) |
(search_title:d search_title:osona)
DisjunctionMaxQuery((search_definitions:"d osona" | search_title:"d

And the results are different as well. Where can I find an explanation for

