On Fri, 2016-12-02 at 11:21 +0000, Markus Jelsma wrote:
> Despite the number of actual results, queries with a very high
> facet.limit are three to five times slower compared to much lower
> values. For example, i have a query that returns roughly 19.000 facet
> results. Queries with facet.limit=20000 return within 200 ms but
> queries with facet.limit= 20 million return after around 800 ms. This
> is in a cloud environment.

First of all, requesting the top 20M facet terms in a multi-node cloud
is really not advisable, as the transfer+merge overhead is huge. Have
you considered streaming?

> I vaguely remember an issue where Solr reserves the requested limit,

I looked at both simple String faceting and numeric faceting in Solr.
While there are pre-allocations of the structures involved, they both
have built-in limiting, so the large performance difference that you
are seeing is a bit strange. This was with the Solr 5.4 code that I
happened to have open. Which version are you using?

Just a thought: For plain search, specifying rows=20M is quite
different from rows=20K, as that code does not have the same limiting
as faceting. Are you perchance setting rows together with facet.limit?
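
One way to rule this out is to time the two facet.limit values with
rows pinned to 0. Below is a minimal sketch of the request parameters
in Python. The field name "category" and the helper
facet_query_params are made up for illustration; q, rows, facet,
facet.field and facet.limit are the standard Solr parameters.

```python
from urllib.parse import urlencode


def facet_query_params(facet_field, facet_limit, rows=0):
    """Build Solr request parameters for a facet-only timing test."""
    return {
        "q": "*:*",
        # Keep rows at 0 so document retrieval does not skew the timings.
        "rows": rows,
        "facet": "true",
        "facet.field": facet_field,
        "facet.limit": facet_limit,
    }


# "category" is a hypothetical field name; substitute the real facet field.
small = facet_query_params("category", 20_000)
large = facet_query_params("category", 20_000_000)

# Query strings to append to the collection's /select URL.
print(urlencode(small))
print(urlencode(large))
```

If the two requests take about the same time with rows=0, the slowdown
is more likely coming from the rows setting than from facet.limit.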

- Toke Eskildsen, State and University Library, Denmark
