If what you describe holds for the range query field:[* TO *], why wouldn't Lucene optimize field:* in a similar way?

On Wed, Apr 17, 2019 at 10:36 AM Shawn Heisey <apa...@elyograg.org> wrote:

> On 4/17/2019 10:51 AM, John Davis wrote:
> > Can you clarify why field:[* TO *] is lot more efficient than field:*
>
> It's a range query. For every document, Lucene just has to answer two
> questions -- is the value more than any possible value and is the value
> less than any possible value. The answer will be yes if the field
> exists, and no if it doesn't. With one million documents, there are two
> million questions that Lucene has to answer. Which probably seems like
> a lot ... but keep reading. (Side note: It wouldn't surprise me if
> Lucene has an optimization specifically for the all inclusive range such
> that it actually only asks one question, not two)
>
> With a wildcard query, there are as many questions as there are values
> in the field. Every question is asked for every single document. So if
> you have a million documents and there are three hundred thousand
> different values contained in the field across the whole index, that's
> 300 billion questions.
>
> Thanks,
> Shawn
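
For anyone following the numbers in the quoted message, here is a rough sketch of that arithmetic in Python. It only reproduces the simplified "questions per document" model described above and says nothing about how Lucene actually executes either query. The function names are made up for illustration and are not part of any Lucene or Solr API.

```python
# Simplified cost model taken from the quoted explanation.
# These helpers are hypothetical; they do not measure real Lucene work.

def range_query_checks(num_docs: int, bounds_per_doc: int = 2) -> int:
    """Questions for field:[* TO *]: one lower and one upper bound check per document."""
    return num_docs * bounds_per_doc


def wildcard_query_checks(num_docs: int, distinct_values: int) -> int:
    """Questions for field:* in this model: every distinct value is checked for every document."""
    return num_docs * distinct_values


if __name__ == "__main__":
    num_docs = 1_000_000        # documents in the index (figure from the thread)
    distinct_values = 300_000   # distinct values in the field (figure from the thread)

    range_checks = range_query_checks(num_docs)
    wildcard_checks = wildcard_query_checks(num_docs, distinct_values)

    print(f"range query checks:    {range_checks:,}")
    print(f"wildcard query checks: {wildcard_checks:,}")
    print(f"ratio:                 {wildcard_checks // range_checks:,}x")
```

Under this model the range query costs 2 million checks and the wildcard query 300 billion, a factor of 150,000.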