Re: &fq degrades qtime in a 20million doc collection

Shawn Heisey Wed, 13 Jan 2016 16:01:53 -0800

On 1/13/2016 3:01 PM, Anria B. wrote:
> I have a really fun question to ask.  I'm sitting here looking at what is by
> far the beefiest box I've ever seen in my life.  256GB of RAM, terabytes
> of disk space, the works.  Linux server, properly partitioned.
>
> Yet what we are seeing goes against all the intuition I've built up in the
> Solr world.
>
> 1.   Collection has 20-30 million docs.
> 2.   q=*&fq=someField:SomeVal   ---> takes 2.5 seconds
> 3.    q=someField:SomeVal     -->  300ms
> 4.   as numFound -> infinity,     qtime -> infinity.
>
> Have any of you encountered such a thing?
>
> That an fq degrades query time by so much?
>
> It's pure Solr 5.3.1.  ZK + Tomcat 8 + 1 shard in Solr.  JDK 8u60.  All
> running on this same box.


A query of * will be slow because it is a wildcard query.  Under the
covers, Lucene looks up every possible value in your default field and
then does a query for every single one of those terms.  In an index with
20-30 million documents, this could be billions of terms.  If you want
to query for all documents, use *:* (star colon star) -- this is a
special query string that literally means "all documents."  It *looks*
like it might mean "all values in all fields" but it's far more specific
than that.
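
To make the difference concrete, here is a minimal sketch that builds the
two request URLs side by side.  The host, port, and collection name
("mycollection") are assumptions for illustration, as is the select_url
helper; only the q and fq values come from the original question.  It
only constructs the URLs and does not contact a server.

```python
from urllib.parse import urlencode

# Hypothetical base URL; substitute your own host and collection name.
BASE = "http://localhost:8983/solr/mycollection"


def select_url(base, q, fq=None, rows=0):
    """Build a /select URL with properly encoded q and fq parameters."""
    params = [("q", q), ("rows", rows), ("wt", "json")]
    if fq is not None:
        params.append(("fq", fq))
    return f"{base}/select?{urlencode(params)}"


# q=* is a wildcard query against the default field (the slow form).
slow = select_url(BASE, "*", fq="someField:SomeVal")

# q=*:* is the special "match all documents" query (the form to use).
fast = select_url(BASE, "*:*", fq="someField:SomeVal")

print(slow)
print(fast)
```

Requesting each URL and comparing the QTime value in the response header
should show whether the wildcard, rather than the fq, accounts for the
slowdown.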

Are you saying that you are running Solr 5.3.1 under Tomcat 8?  If so,
that is likely to cause problems.  The Jetty that's included with
this version is properly tuned for Solr, and the bin/solr start script
will set up good garbage collection tuning.  Running in another
container is almost always a mistake.

Thanks,
Shawn
