On 3/16/2017 1:40 PM, Alexandre Rafalovitch wrote:
> Oh. Try your query with quotes around the phone phrase:
> q="one plus one"

That query with the fieldType the user supplied produces this, on 6.3.0
with the lucene parser:

"querystring":"test:\"one plus one\"",
"parsedquery":"MultiPhraseQuery(test:\"(one plus one plus one) plus one\")",

Looks a little odd, but maybe it's correct.

> My hypothesis is:
> Query parser splits things on whitespace before passing it down into
> analyzer chain as individual match attempts. The Analysis UI does not
> take that into account and treats the whole string as phrase sent. You
> say
> outputUnigrams="false" outputUnigramsIfNoShingles="false"
> So, every single token during the query gets ignored because there is
> nothing for it to shingle with.

Might be that.

If I change both of those unigram options to "true" then this is what I
see (also on 6.3.0, q.op is AND):

"querystring":"test:(one plus one)",
"parsedquery":"+test:one +test:plus +test:one",

The really mystifying thing is ... it works on the analysis page.  The
whitespace tokenizer should (in theory at least) produce the same tokens
on the analysis page as the query parser does before analysis, so I have
no idea why analysis and query produce different results.  During query
analysis, the whitespace tokenizer should basically be a no-op, because
the input has already been tokenized.
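
To make the hypothesis concrete, here is a rough Python sketch of the difference between analyzing the whole string (as the analysis page appears to) and analyzing each whitespace-separated term on its own (as the query parser is suspected to). This is a toy model, not Lucene's actual ShingleFilter code; the function names are made up for illustration and the shingle ordering is not meant to match Lucene's.

```python
def shingles(tokens, min_size=2, max_size=5, output_unigrams=False,
             output_unigrams_if_no_shingles=False):
    """Toy shingle filter: join runs of adjacent tokens with a space."""
    out = []
    for start in range(len(tokens)):
        if output_unigrams:
            out.append(tokens[start])
        for size in range(min_size, max_size + 1):
            if start + size <= len(tokens):
                out.append(" ".join(tokens[start:start + size]))
    # Assumed fallback: emit the bare tokens only when nothing else came out.
    if not out and output_unigrams_if_no_shingles:
        out = list(tokens)
    return out


def analyze(text, **opts):
    """Stand-in for the chain: whitespace tokenizer, lowercase, shingle."""
    return shingles(text.lower().split(), **opts)


def analysis_page(text, **opts):
    """Assumption: the analysis page feeds the whole string to the chain."""
    return analyze(text, **opts)


def query_parser(text, **opts):
    """Assumption: the parser splits on whitespace first, then analyzes
    each term separately, so the shingle filter only ever sees one token."""
    out = []
    for term in text.split():
        out.extend(analyze(term, **opts))
    return out


print(analysis_page("one plus one"))   # shingles are produced
print(query_parser("one plus one"))    # nothing survives
print(query_parser("one plus one", output_unigrams=True))
```

In this model the whole-string path yields shingles, the per-term path yields nothing with both unigram options off, and turning unigrams on gives back the three single terms, which is consistent with the debug output above.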

If I change the analysis to this (keyword instead of whitespace):

        <analyzer type="query">
          <tokenizer class="solr.KeywordTokenizerFactory"/>
          <filter class="solr.LowerCaseFilterFactory"/>
          <filter class="solr.ShingleFilterFactory" minShingleSize="2"
                  maxShingleSize="5" outputUnigrams="false"
                  outputUnigramsIfNoShingles="false"/>
        </analyzer>

Then the behavior is unchanged:

"querystring":"test:(one plus one)",
"parsedquery":"",

> I am not sure why it would have worked in Solr 4.

I just tried it on 4.9-SNAPSHOT, compiled 2015-05-20 from SVN
revision 1680667, and it doesn't work.  I don't remember whether this
was compiled from branch_4x or from the 4.9 branch.  Before that test, I
had tried back to 5.2.1 with the same results:

"querystring": "test:(one plus one)",
"parsedquery": "",

Thanks,
Shawn
