On 7/20/2017 7:20 AM, Hendrik Haddorp wrote:
> the Solr 6.6. ref guide states that to "finds all documents without a
> value for field" you can use:
> -field:[* TO *]
>
> While this is true I'm wondering why it is recommended to use a range
> query instead of simply:
> -field:*

Performance.

A wildcard is expanded to all possible term values for that field.  If
the field has millions of possible terms, then the query object created
at the Lucene level will quite literally have millions of terms in it. 
No matter how you approach a query with those characteristics, it's
going to be slow, for both getting the terms list and executing the query.

A full range query might be somewhat slow when there are many possible
values, but it's a lot faster than a wildcard in those cases.

If the field is only used by a handful of documents and has very few
possible values, then the wildcard might be faster than a range query ...
but this is not common, so the recommended way to do this is with a
range query.
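
As a rough illustration of the recommended form, here is a minimal Python
sketch that builds a select URL using the range query. The base URL, the
collection name "mycollection" and the helper names are assumptions made
for the example, not anything from the ref guide; only the query syntax
itself comes from the discussion above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def missing_field_query(field: str) -> str:
    """Range-query form matching documents that have no value for `field`."""
    return f"-{field}:[* TO *]"


def select_url(base_url: str, collection: str, field: str) -> str:
    """Build a /select URL for the query above (hypothetical helper).

    rows=0 asks only for the hit count, not the documents themselves.
    """
    params = {"q": missing_field_query(field), "rows": 0}
    return f"{base_url}/{collection}/select?{urlencode(params)}"


# Assumed local Solr instance and collection name, for illustration only.
url = select_url("http://localhost:8983/solr", "mycollection", "field")

# Decode the query string again to confirm the query survives URL encoding.
decoded_q = parse_qs(urlsplit(url).query)["q"][0]
print(url)
print(decoded_q)
```

Swapping the body of missing_field_query for the wildcard form
(-field:*) is all it would take to compare the two against a real index.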

Thanks,
Shawn
