Re: Performance on faceting using docValues

Toke Eskildsen Thu, 05 Mar 2015 23:59:43 -0800

On Thu, 2015-03-05 at 21:14 +0100, lei wrote:

You present a very interesting observation. I have not noticed what you
describe, but on the other hand we have not done comparative speed
tests.


> q=*:*&fq=country:"US"&fq=category:112

First observation: Your query is '*:*', which is a "magic" query. Non-DV
faceting has optimizations both for this query (although that ought to
be disabled due to the fq) and for the "inverse" case where there are
more hits than non-hits. Perhaps you could test with a handful of
queries that have different result sizes?
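
A minimal sketch of how such a comparison could be set up in Python. The
endpoint and the test queries are made up; only the fq and facet
parameters come from the request quoted in this mail, and the per-field
overrides are left out for brevity. rows=0 is my own addition, on the
assumption that you want to time faceting rather than document retrieval.
It only builds the request URLs, so each one can be timed against both
the DV and the non-DV setup.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; replace with the real core/collection URL.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"

# Filters and facet settings copied from the quoted request
# (per-field f.<field>.facet.* overrides omitted for brevity).
BASE_PARAMS = [
    ("fq", 'country:"US"'),
    ("fq", "category:112"),
    ("facet", "on"),
    ("facet.sort", "index"),
    ("facet.mincount", "1"),
    ("facet.limit", "2000"),
    ("facet.field", "manufacturer"),
    ("facet.field", "seller"),
    ("facet.field", "material"),
]


def build_url(query, base=SOLR_SELECT):
    """Return a select URL for `query` with the fixed filters and facets."""
    # rows=0 is an assumption: skip document retrieval when timing facets.
    params = [("q", query), ("rows", "0"), ("wt", "json")] + BASE_PARAMS
    return base + "?" + urlencode(params)


# Made-up queries, chosen only to give different result sizes.
TEST_QUERIES = ["*:*", "manufacturer:a*", "seller:foo"]

if __name__ == "__main__":
    for q in TEST_QUERIES:
        print(build_url(q))
```

Running each URL a few times and comparing QTime from the responses
should show whether the difference holds for queries other than '*:*'.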

> &facet=on&facet.sort=index&facet.mincount=1&facet.limit=2000

The combination of index order and a high limit might be an explanation:
When resolving the Strings of the facet result, non-DV faceting performs
ordinal lookups, which are fast when done in monotonically rising order
(sort=index) and when the values are close together (limit=2000). I do
not know whether DV benefits in the same way.

On the other hand, your limit seems to apply only to material, so it
could be that the real number of unique values is low and you just set
the limit to 2000 to be sure you get everything?
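
One way to check this would be to request the field with facet.limit=-1
(no limit) and count the returned terms. A minimal sketch, assuming the
response was fetched with wt=json and uses the flat list layout for
facet counts (term, count, term, count, ...). The sample response below
is made up for illustration.

```python
def count_facet_terms(response, field):
    """Count the terms returned for `field` in a parsed JSON response.

    Assumes the flat [term, count, term, count, ...] layout.
    """
    flat = response["facet_counts"]["facet_fields"][field]
    return len(flat) // 2


# Made-up response fragment, for illustration only.
sample = {
    "facet_counts": {
        "facet_fields": {
            "material": ["cotton", 12, "steel", 7, "wood", 3],
        }
    }
}

if __name__ == "__main__":
    print(count_facet_terms(sample, "material"))
```

Note that with mincount=1 and the fq filters in place this counts only
the terms present in the result set, not all terms in the index.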

> &facet.field=manufacturer&facet.field=seller&facet.field=material
> &f.manufacturer.facet.mincount=1&f.manufacturer.facet.sort=count&f.manufacturer.facet.limit=100
> &f.seller.facet.mincount=1&f.seller.facet.sort=count&f.seller.facet.limit=100
> &f.material.facet.mincount=1&sort=score+desc

How large is your index in bytes, how many documents does it contain, and
is it single-shard or cloud? Could you paste the log lines containing
"UnInverted field", which describe the number of unique values and the
size of your facet fields?
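
If the log is large, a small filter might help to pull those lines out.
A minimal sketch, assuming the log is a plain text file; the path is
hypothetical, and the match is on the word "UnInverted" only, which is an
assumption on my part so that small differences in the message wording
do not matter.

```python
def uninverted_lines(lines, marker="UnInverted"):
    """Return the lines that contain `marker`, without trailing newlines."""
    return [line.rstrip("\n") for line in lines if marker in line]


if __name__ == "__main__":
    import sys

    # Usage: python find_uninverted.py /path/to/solr.log  (hypothetical path)
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for line in uninverted_lines(handle):
            print(line)
```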

- Toke Eskildsen, State and University Library, Denmark
