Yes, as you've found, the ordmap is cached via SlowCompositeReaderWrapper. It's a bit opaque that this is (iiuc?) the main acceptable use of SlowCompositeReaderWrapper: as a wrapper around OrdinalMaps. I'm pretty sure I remember looking into this, and the `si.mapping` in `findStartAndEndOrds` does in fact come via the cachedOrdMaps in SlowCompositeReaderWrapper.
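To make the caching pattern concrete, here is a minimal sketch of a per-field cache that builds an expensive structure once and then reuses it. This is not the actual Solr code: the class name `OrdMapCacheSketch`, the `builder` function and the `buildCount` counter are all hypothetical, and a generic value type stands in for Lucene's OrdinalMap so the example runs without Lucene on the classpath.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical stand-in for a per-reader cache of expensive per-field
 * structures. V plays the role of an ordinal map; it is NOT Lucene's class.
 */
final class OrdMapCacheSketch<V> {
  // Same shape as the field quoted below: field name -> cached structure.
  private final Map<String, V> cachedOrdMaps = new HashMap<>();
  // Assumed expensive build step (e.g. merging per-segment ordinals).
  private final Function<String, V> builder;
  // Counts how often the expensive build actually ran (illustration only).
  private int buildCount = 0;

  OrdMapCacheSketch(Function<String, V> builder) {
    this.builder = builder;
  }

  /** Returns the cached value for the field, building it on first access. */
  synchronized V get(String field) {
    V cached = cachedOrdMaps.get(field);
    if (cached == null) {
      cached = builder.apply(field);
      buildCount++;
      cachedOrdMaps.put(field, cached);
    }
    return cached;
  }

  synchronized int buildCount() {
    return buildCount;
  }
}
```

In this sketch, a second `get` for the same field returns the same instance without rebuilding, but only when the same cache object is shared between callers; a freshly created cache starts empty and pays the build cost again.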
So I'm surprised you're finding this to be a bottleneck, definitely worth investigating. If you're using a standalone index and doing jvm profiling, this issue is probably of limited relevance, but it covers some similar ground: https://issues.apache.org/jira/browse/SOLR-15008.

On Mon, Aug 29, 2022 at 9:21 AM Dawid Weiss <dawid.we...@gmail.com> wrote:

> Digging deeper - hmmm... so there is a cache of ords
> in SlowCompositeReaderWrapper:
>
>   // TODO: consider ConcurrentHashMap ?
>   // TODO: this could really be a weak map somewhere else on the coreCacheKey,
>   // but do we really need to optimize slow-wrapper any more?
>   final Map<String,OrdinalMap> cachedOrdMaps = new HashMap<>();
>
> I wonder why this doesn't seem to be used from request to request in my
> case, eh.
>
> Dawid
>
> On Mon, Aug 29, 2022 at 3:07 PM Dawid Weiss <dawid.we...@gmail.com> wrote:
>
> > Hi,
> >
> > I have a situation here with Solr 8.11.2 in stand-alone mode, a large
> > (200GB+) index with multi-valued doc value string fields. The problem is
> > that faceting over these fields takes a long time. Before you say: "well,
> > duh, of course" I wanted to point out that it takes a long time for *every*
> > query, even those that collect facets from a relatively small subset of all
> > documents (say, one hundred).
> >
> > Looking and debugging the code, I see a few things that made my head
> > scratch but this one is particularly troubling.
> >
> > So, the faceting code goes into FacetFieldProcessorByArrayDV and then most
> > of the time is spent inside findStartAndEndOrds, looking basically for the
> > count of unique ordinals (for all segments). Now, because this is done with
> > a slow reader wrapper, it takes forever. And it's repeated for each and
> > every request - even though clearly the ordinal map (or the count of
> > values) won't change for the same reader:
> >
> >   @Override
> >   protected void findStartAndEndOrds() throws IOException {
> >     if (multiValuedField) {
> >       si = FieldUtil.getSortedSetDocValues(fcontext.qcontext, sf, null);
> >       if (si instanceof MultiDocValues.MultiSortedSetDocValues) {
> >         ordinalMap = ((MultiDocValues.MultiSortedSetDocValues)si).mapping;
> >       }
> >
> > If there is any rationale behind not caching the ordinal map (or just the
> > size of all ords!) then I failed to see it. Otherwise it's a serious
> > concern and slowdown that I think could help many poor souls speed up Solr
> > faceting.
> >
> > Anybody familiar with that code who could comment on the above?
> >
> > Dawid