Re: Solr shingles is not working in solr 6.4.0

Aman Deep Singh Mon, 20 Mar 2017 23:01:59 -0700

I found a workaround: configure the field type as follows.

<fieldType name="cust_shingle" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern=" " replacement=";"/>
    <tokenizer class="solr.PatternTokenizerFactory" pattern=";"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="4"/>
  </analyzer>
</fieldType>


With this field type, the query *one\ plus\ one* does start creating
shingles, but to use it I have to escape the spaces in the query, which
causes problems in other fields. Is there any way to overcome that?
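
For illustration, here is a rough Python model of the analysis chain above.
It is a simplified sketch, not Solr or Lucene code: the function names
shingles and cust_shingle are made up for this example, and it assumes the
ShingleFilterFactory defaults minShingleSize=2 and outputUnigrams=true. It
shows why the escaping matters: multi-word shingles can only form when the
whole string reaches the analyzer in one piece.

```python
import re


def shingles(tokens, min_size=2, max_size=4, output_unigrams=True):
    """Simplified stand-in for a shingle filter: word n-grams joined by a space."""
    out = []
    for start in range(len(tokens)):
        if output_unigrams:
            out.append(tokens[start])
        for size in range(min_size, max_size + 1):
            if start + size <= len(tokens):
                out.append(" ".join(tokens[start:start + size]))
    return out


def cust_shingle(text):
    """Model of the cust_shingle chain: char filter, pattern tokenizer, shingles."""
    replaced = text.replace(" ", ";")                   # charFilter: " " -> ";"
    tokens = [t for t in re.split(";", replaced) if t]  # tokenizer: split on ";"
    return shingles(tokens, max_size=4)


# Whole string analyzed at once (what escaping the spaces achieves).
print(cust_shingle("one plus one"))

# Each whitespace-separated chunk analyzed on its own (unescaped query).
print([cust_shingle(chunk) for chunk in "one plus one".split()])
```

In this model the first call yields "one plus", "one plus one" and
"plus one" alongside the unigrams, while the second call yields only the
single words, because no chunk has a neighbour to shingle with.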


On Fri, Mar 17, 2017 at 9:58 AM Aman Deep Singh <amandeep.coo...@gmail.com>
wrote:

> I also tried this on 5.2.1 with the query:
>
> http://localhost:8984/solr/test/select?q=TITLE_SH:one\%20plus\%20one&wt=xml&debugQuery=true
> <http://localhost:8984/solr/test/select?q=TITLE_SH:one%5C%20plus%5C%20one&wt=xml&debugQuery=true>
>
> <response>
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">1</int>
> <lst name="params">
> <str name="q">TITLE_SH:one\ plus\ one</str>
> <str name="wt">xml</str>
> <str name="debugQuery">true</str>
> </lst>
> </lst>
> <result name="response" numFound="0" start="0"/>
> <lst name="debug">
> <str name="rawquerystring">TITLE_SH:one\ plus\ one</str>
> <str name="querystring">TITLE_SH:one\ plus\ one</str>
> <str name="parsedquery">
> *((TITLE_SH:one plus TITLE_SH:one plus one)/no_coord) TITLE_SH:plus one*
> </str>
> <str name="parsedquery_toString">
> (TITLE_SH:one plus TITLE_SH:one plus one) TITLE_SH:plus one
> </str>
> <lst name="explain"/>
> <str name="QParser">LuceneQParser</str>
>
>
> while on Solr 4.3.1, for the query
>
> http://localhost:8983/solr/collection1/select?q=text_sh:one\%20plus\%20one&wt=xml&debugQuery=true
> <http://localhost:8983/solr/collection1/select?q=text_sh:one%5C%20plus%5C%20one&wt=xml&debugQuery=true>
>
> the output is:
> <response>
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">2</int>
> <lst name="params">
> <str name="q">text_sh:one\ plus\ one</str>
> <str name="wt">xml</str>
> <str name="debugQuery">true</str>
> </lst>
> </lst>
> <result name="response" numFound="0" start="0"/>
> <lst name="debug">
> <str name="rawquerystring">text_sh:one\ plus\ one</str>
> <str name="querystring">text_sh:one\ plus\ one</str>
> <str name="parsedquery">
> (text_sh:one plus text_sh:one plus one text_sh:plus one)/no_coord
> </str>
> <str name="parsedquery_toString">
> *text_sh:one plus text_sh:one plus one text_sh:plus one*
> </str>
> <lst name="explain"/>
> <str name="QParser">LuceneQParser</str>
>
> On Fri, Mar 17, 2017 at 9:50 AM Shawn Heisey <apa...@elyograg.org> wrote:
>
> On 3/16/2017 1:40 PM, Alexandre Rafalovitch wrote:
> > Oh. Try your query with quotes around the phone phrase:
> > q="one plus one"
>
> That query with the fieldType the user supplied produces this, on 6.3.0
> with the lucene parser:
>
> "querystring":"test:\"one plus one\"",
> "parsedquery":"MultiPhraseQuery(test:\"(one plus one plus one) plus one\")",
>
> Looks a little odd, but maybe it's correct.
>
> > My hypothesis is:
> > Query parser splits things on whitespace before passing it down into
> > analyzer chain as individual match attempts. The Analysis UI does not
> > take that into account and treats the whole string as a single phrase. You
> > say
> > outputUnigrams="false" outputUnigramsIfNoShingles="false"
> > So, every single token during the query gets ignored because there is
> > nothing for it to shingle with.
>
> Might be that.
>
> If I change both of those unigram options to "true" then this is what I
> see (also on 6.3.0, q.op is AND):
>
> "querystring":"test:(one plus one)",
> "parsedquery":"+test:one +test:plus +test:one",
>
> The really mystifying thing is ... it works on the analysis page.  The
> whitespace tokenizer should (in theory at least) produce the same tokens
> on the analysis page as the query parser does before analysis, so I have
> no idea why analysis and query produce different results.  During query
> analysis, the whitespace tokenizer should basically be a no-op, because
> the input has already been tokenized.
>
> If I change the analysis to this (keyword instead of whitespace):
>
>         <analyzer type="query">
>           <tokenizer class="solr.KeywordTokenizerFactory"/>
>           <filter class="solr.LowerCaseFilterFactory"/>
>           <filter class="solr.ShingleFilterFactory" minShingleSize="2"
>                  maxShingleSize="5" outputUnigrams="false"
>                  outputUnigramsIfNoShingles="false" />
>         </analyzer>
>
> Then the behavior is unchanged:
>
> "querystring":"test:(one plus one)",
> "parsedquery":"",
>
> > I am not sure why it would have worked in Solr 4.
>
> I just tried it on 4.9-SNAPSHOT, compiled 2015-05-20 from SVN
> revision 1680667, and it doesn't work.  I don't remember whether this
> was compiled from branch_4x or from the 4.9 branch.  Before that test, I
> had tried back to 5.2.1 with the same results:
>
> "querystring": "test:(one plus one)",
> "parsedquery": "",
>
> Thanks,
> Shawn
>
>
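
The hypothesis quoted above (the query parser splits on whitespace before
analysis, so a shingle filter with outputUnigrams="false" and
outputUnigramsIfNoShingles="false" has nothing to emit) can be sketched in
Python. This is a simplified model under stated assumptions, not Lucene's
implementation: shingle_filter, analyze_whole and analyze_presplit are
hypothetical names, and token order in the unigram case is not meant to
match Solr exactly.

```python
def shingle_filter(tokens, min_size=2, max_size=5,
                   output_unigrams=False,
                   output_unigrams_if_no_shingles=False):
    """Simplified shingle filter: emit word n-grams of min_size..max_size."""
    grams = []
    for start in range(len(tokens)):
        for size in range(min_size, max_size + 1):
            if start + size <= len(tokens):
                grams.append(" ".join(tokens[start:start + size]))
    if output_unigrams:
        return list(tokens) + grams  # ordering simplified for the sketch
    if not grams and output_unigrams_if_no_shingles:
        return list(tokens)
    return grams


def analyze_whole(text, **opts):
    """Analysis-page style: the full string goes through the chain once."""
    return shingle_filter(text.lower().split(), **opts)


def analyze_presplit(text, **opts):
    """Query-parser style under the hypothesis: one chunk at a time."""
    return [shingle_filter([chunk.lower()], **opts) for chunk in text.split()]


print(analyze_whole("one plus one"))     # shingles are produced
print(analyze_presplit("one plus one"))  # every chunk comes back empty
print(analyze_presplit("one plus one", output_unigrams=True))
```

Under these assumptions the whole-string path gives "one plus",
"one plus one" and "plus one", the pre-split path gives nothing for any
chunk (consistent with the empty parsedquery), and turning unigrams on
gives one single-word term per chunk, as in the +test:one +test:plus
+test:one output.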
