On 7/20/2017 7:20 AM, Hendrik Haddorp wrote:
> the Solr 6.6. ref guide states that to "finds all documents without a
> value for field" you can use:
> -field:[* TO *]
>
> While this is true I'm wondering why it is recommended to use a range
> query instead of simply:
> -field:*

Performance.

A wildcard is expanded to all possible term values for that field. If the field has millions of possible terms, then the query object created at the Lucene level will quite literally have millions of terms in it. No matter how you approach a query with those characteristics, it's going to be slow, both for getting the terms list and for executing the query.

A full range query might be somewhat slow when there are many possible values, but it's a lot faster than a wildcard in those cases. If the field is only used by a handful of documents and has very few possible values, then the wildcard might be faster than the range query ... but this is not common, so the recommended way to do this is with a range query.

Thanks,
Shawn
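
For reference, here is a minimal Python sketch of building the recommended range form as a request parameter. The helper names, the "price" field, and the "mycore" core are made up for illustration; only the `-field:[* TO *]` syntax comes from the thread, and nothing here talks to a real Solr instance.

```python
from urllib.parse import urlencode


def missing_field_query(field: str) -> str:
    """Build the range-query form that matches documents with no value for `field`."""
    # Range form from the ref guide, rather than the wildcard form `-field:*`.
    return f"-{field}:[* TO *]"


def select_params(field: str, rows: int = 10) -> str:
    """URL-encode the query as parameters for a select request (hypothetical helper)."""
    return urlencode({"q": missing_field_query(field), "rows": rows})


if __name__ == "__main__":
    # "price" and "mycore" are placeholder names, not taken from the thread.
    print(missing_field_query("price"))
    print("/solr/mycore/select?" + select_params("price"))
```

The encoding step matters because the brackets, colon, and spaces in the range syntax are not safe to paste into a URL as-is.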