Re: Optimizing fq query performance

John Davis Wed, 17 Apr 2019 09:59:23 -0700

Can you clarify why field:[* TO *] is a lot more efficient than field:*?

On Sun, Apr 14, 2019 at 12:14 PM Shawn Heisey <apa...@elyograg.org> wrote:


> On 4/13/2019 12:58 PM, John Davis wrote:
> > We noticed a sizable performance degradation when we add certain fq
> > filters to the query even though the result set does not change between
> > the two queries. I would've expected solr to optimize internally by
> > picking the most constrained fq filter first, but maybe my understanding
> > is wrong.
>
> All filters cover the entire index, unless the query parser that you're
> using implements the PostFilter interface, the filter cost is set high
> enough, and caching is disabled.  All three of those conditions must be
> met in order for a filter to only run on results instead of the entire
> index.
>
> http://yonik.com/advanced-filter-caching-in-solr/
> https://lucidworks.com/2017/11/27/caching-and-filters-and-post-filters/
>
> Most query parsers don't implement the PostFilter interface.  The lucene
> and edismax parsers do not implement PostFilter.  Unless you've
> specified the query parser in the fq parameter, it will use the lucene
> query parser, and it cannot be a PostFilter.
>
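
To check my understanding of the three conditions, here is a minimal sketch of how such a filter could be sent. The assumptions are mine and not from your message: I use the frange parser as an example of a parser that implements PostFilter, "price" is a hypothetical field name, and post_filter_fq is just a helper I made up. As far as I know the cost has to be 100 or more for Solr to treat the filter as a post filter.

```python
from urllib.parse import urlencode


def post_filter_fq(func, lower, cost=200):
    """Build an fq value that asks for post-filter execution.

    Uses local params to turn caching off and set a cost of at least 100.
    This only has an effect if the chosen parser implements PostFilter.
    """
    if cost < 100:
        raise ValueError("cost must be >= 100 for post-filter execution")
    return "{!frange l=%s cache=false cost=%d}%s" % (lower, cost, func)


# "price" is a hypothetical numeric field used only for illustration.
params = [
    ("q", "*:*"),
    ("fq", "field2:value"),                      # ordinary filter, whole index
    ("fq", post_filter_fq("field(price)", 10)),  # candidate post filter
]
query_string = urlencode(params)
print(query_string)
```

The plain "field2:value" filter still goes through the lucene parser, so by your description it covers the entire index no matter what cost it is given.
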
> > Here's an example:
> >
> > query1: fq = 'field1:* AND field2:value'
> > query2: fq = 'field2:value'
>
> If the point of the "field1:*" query clause is "make sure field1 exists
> in the document" then you would be a lot better off with this query clause:
>
> field1:[* TO *]
>
> This is an all-inclusive range query.  It works with all field types
> where I have tried it, and that includes TextField types.   It will be a
> lot more efficient than the wildcard query.
>
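
For reference, this is roughly how I would rewrite query1 with the range clause. It is only a sketch: exists_clause is a hypothetical helper of mine, and q=*:* is a placeholder for our real main query.

```python
from urllib.parse import urlencode


def exists_clause(field):
    """Return an all-inclusive range clause that matches documents having the field."""
    return "%s:[* TO *]" % field


# Same shape as query1, with the wildcard clause replaced by the range clause.
fq = exists_clause("field1") + " AND field2:value"

# urlencode takes care of escaping the brackets, asterisks and spaces.
query_string = urlencode([("q", "*:*"), ("fq", fq)])
print(fq)
print(query_string)
```
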
> Here's what happens with "field1:*".  If the cardinality of field1 is
> ten million different values, then the query that gets constructed for
> Lucene will literally contain ten million values.  And every single one
> of them will need to be compared to every document.  That's a LOT of
> comparisons.  Wildcard queries are normally very slow.
>
> Thanks,
> Shawn
>
