Betim Deva created SOLR-11769:
---------------------------------
Summary: Sorting performance degrades when useFilterForSortedQuery
is enabled and there is no filter query specified
Key: SOLR-11769
URL: https://issues.apache.org/jira/browse/SOLR-11769
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: search
Affects Versions: 4.10.4
Environment: OS: macOS Sierra (version 10.12.4)
Memory: 16GB
CPU: 10.12.4
Java Version: 1.8
Reporter: Betim Deva
Sorting performance degrades significantly when
{{useFilterForSortedQuery}} is enabled and no filter query is specified.
*Steps to Reproduce:*
1. Set {{useFilterForSortedQuery=true}} in {{solrconfig.xml}}
2. Run a sorted query that matches and returns a single document
- Example: {{/select?q=foo:123&sort=bar+desc}}
On a large index (> 10 million documents), this yields a slow response
(a few hundred milliseconds on average) even when the result set
consists of a single document.
*Observation 1:*
- Disabling {{useFilterForSortedQuery}} improves the performance to < 1 ms
*Observation 2:*
- Removing the {{sort}} parameter improves the performance to < 1 ms
*Observation 3:*
- Keeping the {{sort}}, and adding any filter query (such as {{fq=\*:\*}})
improves the performance to < 1 ms.
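For anyone re-running the comparison, the request variants above can be generated with a small helper. This is only a sketch: {{ReproUrls}}, {{enc}} and {{select}} are hypothetical names, not part of Solr or SolrJ, and the field names {{foo}} and {{bar}} are the ones from the example query. Observation 1 is a {{solrconfig.xml}} change and is not covered here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper (not part of Solr): builds the /select request
// paths for the slow case and for Observations 2 and 3.
class ReproUrls {
    // URL-encode a single parameter value as UTF-8.
    static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM.
            throw new IllegalStateException(e);
        }
    }

    // Build a /select path; the sort and fq parameters are omitted when null.
    static String select(String q, String sort, String fq) {
        StringBuilder url = new StringBuilder("/select?q=").append(enc(q));
        if (sort != null) url.append("&sort=").append(enc(sort));
        if (fq != null) url.append("&fq=").append(enc(fq));
        return url.toString();
    }

    public static void main(String[] args) {
        System.out.println(select("foo:123", "bar desc", null));  // slow case
        System.out.println(select("foo:123", null, null));        // Observation 2
        System.out.println(select("foo:123", "bar desc", "*:*")); // Observation 3
    }
}
```

Timing each path against the same core, with {{useFilterForSortedQuery=true}}, should show whether the difference described above reproduces on other setups.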
After profiling
[SolrIndexSearcher.java|https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;a=blob;f=solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java;h=9ee5199bdf7511c70f2cc616c123292c97d36b5b;hb=HEAD#l1400],
I found that the bottleneck is the call
{{DocSet bigFilt = getDocSet(cmd.getFilterList());}}
when {{cmd.getFilterList()}} is {{null}}. In that case {{getDocSet()}}
collects document IDs on every single call, without any caching.
{code:java}
1394 if (useFilterCache) {
1395 // now actually use the filter cache.
1396 // for large filters that match few documents, this may be
1397 // slower than simply re-executing the query.
1398 if (out.docSet == null) {
1399 out.docSet = getDocSet(cmd.getQuery(), cmd.getFilter());
1400 DocSet bigFilt = getDocSet(cmd.getFilterList());
1401 if (bigFilt != null) out.docSet = out.docSet.intersection(bigFilt);
1402 }
1403 // todo: there could be a sortDocSet that could take a list of
1404 // the filters instead of anding them first...
1405 // perhaps there should be a multi-docset-iterator
1406 sortDocSet(qr, cmd);
1407 }
{code}
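To make the cost easier to see, here is a toy model of the behaviour described above. It is not Solr code and not a proposed patch: {{FilterGuardSketch}}, {{unguarded}}, {{guarded}} and {{uncachedScans}} are hypothetical names, doc sets are modelled as {{java.util.BitSet}}, and the assumption that a {{null}} filter list is collected from scratch each time comes from the profiling result above. The {{guarded}} variant is only one possible way to skip the call when there is nothing to intersect with.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model, NOT Solr code: every name here is hypothetical.
class FilterGuardSketch {
    final int maxDoc;
    int uncachedScans = 0; // how often a whole-index doc set was rebuilt
    final Map<List<BitSet>, BitSet> filterCache = new HashMap<>();

    FilterGuardSketch(int maxDoc) {
        this.maxDoc = maxDoc;
    }

    // Models the reported behaviour: a null filter list bypasses the cache
    // and costs work proportional to the index size on every call.
    BitSet getDocSet(List<BitSet> filters) {
        if (filters == null) {
            uncachedScans++;
            BitSet all = new BitSet(maxDoc);
            all.set(0, maxDoc);
            return all;
        }
        // Non-null filter lists are intersected once, then served from the cache.
        return filterCache.computeIfAbsent(filters, fs -> {
            BitSet acc = new BitSet(maxDoc);
            acc.set(0, maxDoc);
            for (BitSet f : fs) acc.and(f);
            return acc;
        });
    }

    // Same shape as the quoted snippet: always asks for the filter doc set.
    BitSet unguarded(BitSet queryDocs, List<BitSet> filters) {
        BitSet out = (BitSet) queryDocs.clone();
        BitSet bigFilt = getDocSet(filters);
        if (bigFilt != null) out.and(bigFilt);
        return out;
    }

    // One possible guard (an assumption): skip the call when there are no filters.
    BitSet guarded(BitSet queryDocs, List<BitSet> filters) {
        BitSet out = (BitSet) queryDocs.clone();
        if (filters != null && !filters.isEmpty()) out.and(getDocSet(filters));
        return out;
    }
}
```

In this model both variants return the same doc set for a {{null}} filter list, since intersecting with "all documents" changes nothing, but only {{unguarded}} pays for the full collection each time. Passing a non-null list, as with {{fq=\*:\*}}, goes through the cached branch instead, which is consistent with Observation 3.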