Re: autoGeneratePhraseQueries not working

Alexandre Rafalovitch Tue, 16 Apr 2019 04:10:08 -0700

The issue is that the Standard Query Parser pre-processes the query and
splits it on whitespace beforehand (to deal with all the special
syntax). So, if you don't use quoted phrases, then by the time the
field-specific query analyzer chain kicks in, the text is already
pre-split and the analyzer only sees one whitespace-separated token at
a time, and autoGeneratePhraseQueries does not work. If you use a
different parser that sends the whole text in (e.g. FieldQParser), then
- I think - it will work.
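
To make the "what does the analyzer actually see" point concrete, here is
a toy sketch in Python. It is not Solr code: analyzer_inputs and
toy_analyze are hypothetical helpers, and the regex is only a crude
stand-in for a whitespace tokenizer plus word-delimiter plus lowercase
chain. It only shows which chunks of text reach analysis in each case.

```python
import re

def analyzer_inputs(query_text, split_on_whitespace):
    """Return the chunks of text a parser would hand to the analyzer.

    Hypothetical helper: with splitting, each whitespace-separated chunk
    is analyzed on its own; without it, the whole text goes in at once.
    """
    if split_on_whitespace:
        return query_text.split()
    return [query_text]

def toy_analyze(chunk):
    """Crude stand-in for an analysis chain (assumption, not Solr's logic):
    split on whitespace, then on letter/digit boundaries, then lowercase."""
    tokens = []
    for word in chunk.split():
        tokens.extend(part.lower() for part in re.findall(r"[A-Za-z]+|[0-9]+", word))
    return tokens

if __name__ == "__main__":
    for split in (True, False):
        chunks = analyzer_inputs("my25word other", split)
        print(split, [(chunk, toy_analyze(chunk)) for chunk in chunks])
```

In this toy model the pre-split case calls the analyzer twice (once per
chunk), while the unsplit case calls it once with the whole string.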


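Or, like you discovered, setting sow explicitly changes the behaviour.
Note that sow stands for "split on whitespace": sow=true makes the
Standard Query Parser run analysis separately on each
whitespace-separated term, while sow=false sends the text to analysis
all together.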

It is a bit of a messy part of Solr, because the Admin Analysis page
sends the text to the query analyzer without splitting (it does not
use any Query Parser). So, that adds to the confusion.
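
Since the Analysis page bypasses the query parser, one way to compare
the variants is to send the same query through the parser with
debugQuery=true and look at "parsedquery". A minimal Python sketch that
only builds the request URLs (no request is sent); the host and the core
name "mycore" are assumptions, and debug_url is a hypothetical helper:

```python
from urllib.parse import urlencode

# Assumption: a local Solr with a core named "mycore"; adjust as needed.
BASE = "http://localhost:8983/solr/mycore/select"

def debug_url(q, **extra):
    """Build a /select URL that asks Solr to include debug output."""
    params = {"q": q, "debugQuery": "true", "rows": "0"}
    params.update(extra)
    return BASE + "?" + urlencode(params)

urls = {
    # Standard parser, splitting on whitespace before analysis.
    "sow=true": debug_url("TROUBLESHOOT:my25word", sow="true"),
    # Standard parser, whole text handed to analysis at once.
    "sow=false": debug_url("TROUBLESHOOT:my25word", sow="false"),
    # Field query parser, selected through local params.
    "field parser": debug_url("{!field f=TROUBLESHOOT}my25word"),
}

if __name__ == "__main__":
    for label, url in urls.items():
        print(label, url)
```

Comparing the "parsedquery" entries of the three responses should show
directly which variant produces the phrase query.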

Regards,
   Alex.

On Tue, 16 Apr 2019 at 10:53, Leonardo Francalanci
<leoonar...@yahoo.it.invalid> wrote:
>
> To add some information: using "sow=true" it seems to work. But I don't
> understand why it wouldn't work with "sow=false" (I can't find anything in
> the docs about how sow interacts with autoGeneratePhraseQueries), or what
> the implications of setting sow=true are.
> I've found this: [SOLR-9185] Solr's edismax and "Lucene"/standard query
> parsers should optionally not split on whitespace before sending terms to
> analysis - ASF JIRA
>
> But it's very low level, and I can't find any more "user-friendly" documentation.
>
>     On Tuesday, 16 April 2019, 09:00:08 CEST, Leonardo Francalanci
> <leoonar...@yahoo.it.INVALID> wrote:
>
>  Hi,
>
> I'm using Solr 8.0.0 and I can't get autoGeneratePhraseQueries to work (I
> also tried 7.7.1, with the same result):
>
> "debug":{
>     "rawquerystring":"TROUBLESHOOT:my25word",
>     "querystring":"TROUBLESHOOT:my25word",
>     "parsedquery":"TROUBLESHOOT:my TROUBLESHOOT:25 TROUBLESHOOT:word",
>     "parsedquery_toString":"TROUBLESHOOT:my TROUBLESHOOT:25 TROUBLESHOOT:word",
>
> I expected something like
>
> "parsedquery":"TROUBLESHOOT:\"my 25 word\""
>
> Why isn't autoGeneratePhraseQueries generating a quoted phrase when I
> query?
>
>
> This is my configuration:
>
> <dynamicField name="*_txt_en_split" type="text_en_splitting"
>               indexed="true" stored="true"/>
> <fieldType name="text_en_splitting" class="solr.TextField"
>            positionIncrementGap="100" autoGeneratePhraseQueries="true">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <!-- in this example, we will only use synonyms at query time
>     <filter class="solr.SynonymGraphFilterFactory"
>             synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
>     -->
>     <!-- Case insensitive stop word removal. -->
>     <filter class="solr.StopFilterFactory"
>             ignoreCase="true"
>             words="lang/stopwords_en.txt"/>
>     <filter class="solr.WordDelimiterGraphFilterFactory"
>             generateWordParts="1" generateNumberParts="1"
>             catenateWords="1" catenateNumbers="1"
>             catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>     <filter class="solr.FlattenGraphFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymGraphFilterFactory"
>             synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory"
>             ignoreCase="true"
>             words="lang/stopwords_en.txt"/>
>     <filter class="solr.WordDelimiterGraphFilterFactory"
>             generateWordParts="1" generateNumberParts="1"
>             catenateWords="0" catenateNumbers="0"
>             catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>   </analyzer>
> </fieldType>
> <field name="TROUBLESHOOT" type="text_en_splitting" indexed="true"
>        stored="true" multiValued="true" omitNorms="true"/>
>
>
