Hi Jack,

I also ran the test with queries that have query terms (with filters too).
Solr 5 is faster than Solr 4 in that test. I took the query set from
our production log, and almost all of our queries have a filter. That suggests
to me that it is not the filter query that is slow.

I copied the fq query into the q field (I did not remove fq, though). Solr 5
is slightly faster than Solr 4 for this query (QTime 51 vs 64):

solr4:

<?xml version="1.0" encoding="UTF-8"?><response>
   <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">64</int>
      <lst name="params">
         <str name="fl">id</str>
         <str name="start">0</str>
         <str name="q">+categoryIdsPath:1001</str>
         <str name="debug">true</str>
         <str name="fq">+categoryIdsPath:1001</str>
         <str name="rows">2</str>
      </lst>
   </lst>
   <result name="response" numFound="573467" start="0">
      <doc>
         <str name="id">36652255</str>
      </doc>
      <doc>
         <str name="id">36651884</str>
      </doc>
   </result>
   <lst name="debug">
      <str name="rawquerystring">+categoryIdsPath:1001</str>
      <str name="querystring">+categoryIdsPath:1001</str>
      <str name="parsedquery">+categoryIdsPath:1001</str>
      <str name="parsedquery_toString">+categoryIdsPath:1001</str>
      <lst name="explain">
         <str name="36652255">20.451632 = (MATCH)
weight(categoryIdsPath:1001 in 19) [], result of:
  20.451632 = score(doc=19,freq=1.0 = termFreq=1.0
), product of:
    4.522348 = queryWeight, product of:
      4.522348 = idf(docFreq=610392, maxDocs=20670250)
      1.0 = queryNorm
    4.522348 = fieldWeight in 19, product of:
      1.0 = tf(freq=1.0), with freq of:
        1.0 = termFreq=1.0
      4.522348 = idf(docFreq=610392, maxDocs=20670250)
      1.0 = fieldNorm(doc=19)</str>
         <str name="36651884">20.451632 = (MATCH)
weight(categoryIdsPath:1001 in 44) [], result of:
  20.451632 = score(doc=44,freq=1.0 = termFreq=1.0
), product of:
    4.522348 = queryWeight, product of:
      4.522348 = idf(docFreq=610392, maxDocs=20670250)
      1.0 = queryNorm
    4.522348 = fieldWeight in 44, product of:
      1.0 = tf(freq=1.0), with freq of:
        1.0 = termFreq=1.0
      4.522348 = idf(docFreq=610392, maxDocs=20670250)
      1.0 = fieldNorm(doc=44)</str>
      </lst>
      <str name="QParser">LuceneQParser</str>
      <arr name="filter_queries">
         <str>+categoryIdsPath:1001</str>
      </arr>
      <arr name="parsed_filter_queries">
         <str>+categoryIdsPath:1001</str>
      </arr>
      <lst name="timing">
         <double name="time">63.0</double>
         <lst name="prepare">
            <double name="time">3.0</double>
            <lst name="query">
               <double name="time">3.0</double>
            </lst>
            <lst name="facet">
               <double name="time">0.0</double>
            </lst>
            <lst name="mlt">
               <double name="time">0.0</double>
            </lst>
            <lst name="highlight">
               <double name="time">0.0</double>
            </lst>
            <lst name="stats">
               <double name="time">0.0</double>
            </lst>
            <lst name="debug">
               <double name="time">0.0</double>
            </lst>
         </lst>
         <lst name="process">
            <double name="time">60.0</double>
            <lst name="query">
               <double name="time">57.0</double>
            </lst>
            <lst name="facet">
               <double name="time">0.0</double>
            </lst>
            <lst name="mlt">
               <double name="time">0.0</double>
            </lst>
            <lst name="highlight">
               <double name="time">0.0</double>
            </lst>
            <lst name="stats">
               <double name="time">0.0</double>
            </lst>
            <lst name="debug">
               <double name="time">3.0</double>
            </lst>
         </lst>
      </lst>
   </lst></response>
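
As a sanity check on the explain output above, here is a minimal Python
sketch that recomputes the idf and score values. It assumes the index uses
Lucene's classic TF-IDF similarity, where idf = 1 + ln(maxDocs / (docFreq + 1));
the function names are only illustrative and are not part of Solr.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """idf under the (assumed) classic TF-IDF similarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def single_term_score(doc_freq: int, max_docs: int, tf: float = 1.0,
                      query_norm: float = 1.0, field_norm: float = 1.0) -> float:
    """score = queryWeight * fieldWeight, mirroring the explain tree."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm        # "queryWeight, product of"
    field_weight = tf * idf * field_norm   # "fieldWeight in N, product of"
    return query_weight * field_weight


if __name__ == "__main__":
    # docFreq and maxDocs copied from the explain output above
    print(classic_idf(610392, 20670250))
    print(single_term_score(610392, 20670250))
```

The score depends on docFreq and maxDocs, so two indexes with slightly
different document counts will report slightly different scores for the
same document.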


solr5:

<?xml version="1.0" encoding="UTF-8"?><response>
   <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">51</int>
      <lst name="params">
         <str name="fl">id</str>
         <str name="start">0</str>
         <str name="q">+categoryIdsPath:1001</str>
         <str name="debug">true</str>
         <str name="fq">+categoryIdsPath:1001</str>
         <str name="rows">2</str>
      </lst>
   </lst>
   <result name="response" numFound="566462" start="0">
      <doc>
         <str name="id">36652255</str>
      </doc>
      <doc>
         <str name="id">36651884</str>
      </doc>
   </result>
   <lst name="debug">
      <str name="rawquerystring">+categoryIdsPath:1001</str>
      <str name="querystring">+categoryIdsPath:1001</str>
      <str name="parsedquery">+categoryIdsPath:1001</str>
      <str name="parsedquery_toString">+categoryIdsPath:1001</str>
      <lst name="explain">
         <str name="36652255">20.420362 = weight(categoryIdsPath:1001
in 20) [], result of:
  20.420362 = score(doc=20,freq=1.0), product of:
    4.5188894 = queryWeight, product of:
      4.5188894 = idf(docFreq=602005, maxDocs=20315855)
      1.0 = queryNorm
    4.5188894 = fieldWeight in 20, product of:
      1.0 = tf(freq=1.0), with freq of:
        1.0 = termFreq=1.0
      4.5188894 = idf(docFreq=602005, maxDocs=20315855)
      1.0 = fieldNorm(doc=20)</str>
         <str name="36651884">20.420362 = weight(categoryIdsPath:1001
in 49) [], result of:
  20.420362 = score(doc=49,freq=1.0), product of:
    4.5188894 = queryWeight, product of:
      4.5188894 = idf(docFreq=602005, maxDocs=20315855)
      1.0 = queryNorm
    4.5188894 = fieldWeight in 49, product of:
      1.0 = tf(freq=1.0), with freq of:
        1.0 = termFreq=1.0
      4.5188894 = idf(docFreq=602005, maxDocs=20315855)
      1.0 = fieldNorm(doc=49)</str>
      </lst>
      <str name="QParser">LuceneQParser</str>
      <arr name="filter_queries">
         <str>+categoryIdsPath:1001</str>
      </arr>
      <arr name="parsed_filter_queries">
         <str>+categoryIdsPath:1001</str>
      </arr>
      <lst name="timing">
         <double name="time">51.0</double>
         <lst name="prepare">
            <double name="time">1.0</double>
            <lst name="query">
               <double name="time">1.0</double>
            </lst>
            <lst name="facet">
               <double name="time">0.0</double>
            </lst>
            <lst name="facet_module">
               <double name="time">0.0</double>
            </lst>
            <lst name="mlt">
               <double name="time">0.0</double>
            </lst>
            <lst name="highlight">
               <double name="time">0.0</double>
            </lst>
            <lst name="stats">
               <double name="time">0.0</double>
            </lst>
            <lst name="expand">
               <double name="time">0.0</double>
            </lst>
            <lst name="debug">
               <double name="time">0.0</double>
            </lst>
         </lst>
         <lst name="process">
            <double name="time">50.0</double>
            <lst name="query">
               <double name="time">48.0</double>
            </lst>
            <lst name="facet">
               <double name="time">0.0</double>
            </lst>
            <lst name="facet_module">
               <double name="time">0.0</double>
            </lst>
            <lst name="mlt">
               <double name="time">0.0</double>
            </lst>
            <lst name="highlight">
               <double name="time">0.0</double>
            </lst>
            <lst name="stats">
               <double name="time">0.0</double>
            </lst>
            <lst name="expand">
               <double name="time">0.0</double>
            </lst>
            <lst name="debug">
               <double name="time">2.0</double>
            </lst>
         </lst>
      </lst>
   </lst></response>
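
To compare the two responses without reading the XML by eye, something like
the following Python sketch could pull out QTime and the per-component
timings. It relies only on the element names visible in the debug output
above; summarize_timing and SAMPLE are hypothetical names, and SAMPLE is a
trimmed copy of the solr5 response.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the solr5 response above (assumption: same element layout).
SAMPLE = """<response>
  <lst name="responseHeader"><int name="status">0</int><int name="QTime">51</int></lst>
  <lst name="debug">
    <lst name="timing">
      <double name="time">51.0</double>
      <lst name="prepare">
        <double name="time">1.0</double>
        <lst name="query"><double name="time">1.0</double></lst>
      </lst>
      <lst name="process">
        <double name="time">50.0</double>
        <lst name="query"><double name="time">48.0</double></lst>
        <lst name="debug"><double name="time">2.0</double></lst>
      </lst>
    </lst>
  </lst>
</response>"""


def summarize_timing(response_xml) -> dict:
    """Return QTime plus prepare/process timings from a debug=true response."""
    if isinstance(response_xml, str):
        # Parse as bytes so an XML encoding declaration is handled uniformly.
        response_xml = response_xml.encode("utf-8")
    root = ET.fromstring(response_xml)
    qtime = root.find("./lst[@name='responseHeader']/int[@name='QTime']")
    timing = root.find("./lst[@name='debug']/lst[@name='timing']")
    summary = {"QTime": int(qtime.text)}
    for phase in ("prepare", "process"):
        phase_node = timing.find(f"./lst[@name='{phase}']")
        summary[phase] = float(phase_node.find("./double[@name='time']").text)
        # Each child <lst> is one search component (query, facet, debug, ...).
        for comp in phase_node.findall("./lst"):
            key = f"{phase}.{comp.get('name')}"
            summary[key] = float(comp.find("./double[@name='time']").text)
    return summary


if __name__ == "__main__":
    for name, value in summarize_timing(SAMPLE).items():
        print(name, value)
```

Running it on both saved responses and comparing the "process.query" values
side by side shows where the time goes in each version.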



On Fri, Nov 6, 2015 at 12:12 PM, Jack Krupansky <jack.krupan...@gmail.com>
wrote:

> Just to be clear, I was suggesting that the filter query (fq) was slow, not
> the MatchAllDocsQuery, which should be just as speedy as before. You can
> test for yourself whether the MADQ by itself is any slower.
>
> You could also test using the fq as the main query (q) - with no fq
> parameter, and see if that is a lot faster, both with old and new Solr.
>
> -- Jack Krupansky
>
> On Fri, Nov 6, 2015 at 3:01 PM, wei <sw90...@gmail.com> wrote:
>
> > Thanks Jack and Shawn. I checked these Jira tickets, but I am not sure if
> > the slowness of MatchAllDocsQuery is also caused by the removal of
> > fieldcache. Can someone please explain a little bit?
> >
> > Thanks,
> > Wei
> >
> > On Fri, Nov 6, 2015 at 7:15 AM, Shawn Heisey <apa...@elyograg.org> wrote:
> >
> > > On 11/5/2015 10:25 PM, Jack Krupansky wrote:
> > > > I vaguely recall some discussion concerning removal of the field cache in
> > > > Lucene.
> > >
> > > The FieldCache wasn't exactly *removed* ... it's more like it was
> > > renamed, improved, and sort of hidden in a miscellaneous package.  Some
> > > things still require this functionality, so they use the hidden class
> > > instead, which was changed to use the DocValues API.
> > >
> > > https://issues.apache.org/jira/browse/LUCENE-5666
> > >
> > > I am not qualified to discuss LUCENE-5666 beyond what I wrote in the
> > > paragraph above, and it's possible that some of what I said is wrong
> > > because I do not really understand the APIs involved.
> > >
> > > The change has caused problems for Solr.  End result from Solr's
> > > perspective: Certain things which used to work perfectly fine (mostly
> > > facets and grouping) in Solr 4.x have one of two problems in 5.x:
> > > Either they don't work at all, or performance has gone way down.  Some
> > > of these problems are documented in Jira.  These are the issues I know
> > > about:
> > >
> > > https://issues.apache.org/jira/browse/SOLR-8088
> > > https://issues.apache.org/jira/browse/SOLR-7495
> > > https://issues.apache.org/jira/browse/SOLR-8096
> > >
> > > For fields where adding docValues is a viable option (most field types
> > > other than solr.TextField), adding docValues and reindexing is very
> > > likely to solve those problems.
> > >
> > > Sometimes adding docValues won't work, either because the field type
> > > > doesn't allow it, or because it's the indexed terms that are needed, not
> > > > the original field value.  For those situations, there is currently no
> > > solution.
> > >
> > > Thanks,
> > > Shawn
> > >
> > >
> >
>
