[ https://issues.apache.org/jira/browse/SOLR-15008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17236213#comment-17236213 ]
Michael Gibney commented on SOLR-15008:
---------------------------------------

Thanks for the extra information; I think I'm no longer confused about the profiling indicating {{OrdinalMap}} as the culprit!

The key is that with a 60s autoSoftCommit interval, and 8(shards)x3(replicas/shard)=24 cores/replicas, assuming random drift among the actual commit times on each replica, one of the 24 replicas will have a "cold" searcher on average every 2.5 seconds (60s/24). A distributed request then (by default) randomly picks 8 replicas (one per shard) from among those 24, but a subsequent request (by default) picks a _different_ random 8 replicas. If even _one_ of those replicas has a cold searcher, the entire top-level request would have to wait for the {{OrdinalMap}} on that one replica.

There are a couple of approaches that could evaluate/address this hypothesis:
# configure stable replica routing using the {{replica.base}} property of the [shards.preference parameter|https://lucene.apache.org/solr/guide/8_7/distributed-requests.html#shards-preference-parameter] (note: I'd strongly recommend this anyway for most use cases, but only for v8.5.2+, because of SOLR-14471). This would not solve the problem, but would decrease the likelihood of picking a "cold" replica.
# if this hypothesis is correct, then {{distrib=false}} queries against a specific core/replica should exhibit behavior tied more clearly to the 60s autoSoftCommit interval
# ultimately the fix would be to add a nominal facet on the relevant field(s) to one of your configured static warming queries, for the sole purpose of building the {{OrdinalMap}}.

I'm curious to know if this helps. I still think "facet on actual values" could help this particular use case (low-cardinality domain, high-cardinality field) -- but I'd expect the underlying issue ({{OrdinalMap}} building) to equally affect the high-cardinality domain use case, so if both use cases are equally helped by configuring a warming query, that may ultimately be the way to go.
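As a rough sanity check on the arithmetic behind this hypothesis, here is a minimal Python sketch (not Solr code). The first function reproduces the 2.5s figure from the numbers in this comment. The second estimates how often a distributed request would touch at least one cold replica; it relies on a hypothetical {{cold_window_s}} parameter (how long a replica stays cold after each soft commit), which is an assumption made for illustration and not something measured in this issue.

```python
def mean_seconds_between_cold_searchers(commit_interval_s: float, num_replicas: int) -> float:
    """Average gap between new (cold) searchers appearing somewhere in the
    cluster, assuming commit times drift independently across replicas."""
    return commit_interval_s / num_replicas


def prob_request_hits_cold_replica(commit_interval_s: float,
                                   cold_window_s: float,
                                   num_shards: int) -> float:
    """Chance that at least one of the replicas picked (one per shard) is cold.

    ASSUMPTION: every replica stays cold for a fixed cold_window_s after each
    commit, independently of the others. Real behavior depends on when the
    first facet request (or warming query) builds the OrdinalMap.
    """
    cold_fraction = min(cold_window_s / commit_interval_s, 1.0)
    return 1.0 - (1.0 - cold_fraction) ** num_shards


if __name__ == "__main__":
    # 60s autoSoftCommit, 8 shards x 3 replicas per shard = 24 replicas
    print(mean_seconds_between_cold_searchers(60, 8 * 3))  # 2.5
    # cold_window_s=4.0 is a made-up value, used only to show the shape of the effect
    print(prob_request_hits_cold_replica(60, cold_window_s=4.0, num_shards=8))
```

Under this simplified model, even a short cold window per replica adds up quickly across 8 shards, which is consistent with the query "almost always" being slow.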

(If warming queries do indeed help here, the only argument I can see for pursuing facet-by-value would be if you expect to facet on a field _exclusively_ for low-cardinality domains, _and_ the field is sufficiently high-cardinality that either the CPU cost of building {{OrdinalMap}} in a warming query, or the memory cost of keeping it on the heap, is deemed prohibitively expensive.)

> Avoid building OrdinalMap for each facet
> ----------------------------------------
>
>                 Key: SOLR-15008
>                 URL: https://issues.apache.org/jira/browse/SOLR-15008
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: Facet Module
>    Affects Versions: 8.7
>            Reporter: Radu Gheorghe
>            Priority: Major
>              Labels: performance
>         Attachments: Screenshot 2020-11-19 at 12.01.55.png, writes_commits.png
>
>
> I'm running into the following scenario:
> * [JSON] faceting on a high cardinality field
> * few matching documents => few unique values
> Yet the query almost always takes a long time. Here's an example taking
> almost 4s for ~300 documents and unique values (edited a bit):
>
> {code:java}
>   "QTime":3869,
>   "params":{
>     "json":"{\"query\": \"*:*\",
>       \"filter\": [\"type:test_type\", \"date:[1603670360 TO 1604361599]\", \"unique_id:49866\"]
>       \"facet\": {\"keywords\":{\"type\":\"terms\",\"field\":\"keywords\",\"limit\":20,\"mincount\":20}}}",
>     "rows":"0"}},
>
>   "response":{"numFound":333,"start":0,"maxScore":1.0,"numFoundExact":true,"docs":[]
>   },
>   "facets":{
>     "count":333,
>     "keywords":{
>       "buckets":[{
>           "val":"value1",
>           "count":124},
>   ...
> {code}
> I did some [profiling with our Sematext
> Monitoring|https://sematext.com/docs/monitoring/on-demand-profiling/] and it
> points me to OrdinalMap building (see attached screenshot). If I read the
> code right, an OrdinalMap is built with every facet. And it's expensive since
> there are many unique values in the shard (previously, there were more, smaller
> shards, which made latency better, but that approach doesn't scale for this
> particular use case).
> If I'm right up to this point, I see a couple of potential improvements,
> [inspired from
> Elasticsearch|#search-aggregations-bucket-terms-aggregation-execution-hint]:
> # *Keep the OrdinalMap cached until the next softCommit*, so that only the
> first query takes the penalty
> # *Allow faceting on actual values (a Map) rather than ordinals*, for
> situations like the one above where we have few matching documents. We could
> potentially auto-detect this scenario (e.g. by configuring a threshold) and
> use a Map when there are few documents
> I'm curious about what you're thinking:
> * would a PR/patch be welcome for either of the two ideas above?
> * do you see better options? am I missing something?
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org