The fact that it's slow with 100 docs makes me wonder: how many values are in the multi-value field?
I'll load up some docs tomorrow with a multi-value field and see how it
performs locally.

Joel Bernstein
http://joelsolr.blogspot.com/


On Thu, Sep 1, 2022 at 9:50 PM Michael Gibney <mich...@michaelgibney.net> wrote:

> Yes, as you've found, the ordmap is cached via
> SlowCompositeReaderWrapper. It's a bit opaque that that's (iiuc?) the
> main acceptable use of SlowCompositeReaderWrapper -- as a wrapper
> around OrdinalMaps. I'm pretty sure I remember looking into this, and
> the `si.mapping` in `findStartAndEndOrds` does in fact come via the
> cachedOrdMaps in SlowCompositeReaderWrapper.
>
> So I'm surprised you're finding this to be a bottleneck, definitely
> worth investigating. If you're using a standalone index and doing jvm
> profiling, this issue is probably of limited relevance, but it covers
> some similar ground: https://issues.apache.org/jira/browse/SOLR-15008.
>
>
> On Mon, Aug 29, 2022 at 9:21 AM Dawid Weiss <dawid.we...@gmail.com> wrote:
> >
> > Digging deeper - hmmm... so there is a cache of ords
> > in SlowCompositeReaderWrapper:
> >
> >   // TODO: consider ConcurrentHashMap ?
> >   // TODO: this could really be a weak map somewhere else on the coreCacheKey,
> >   // but do we really need to optimize slow-wrapper any more?
> >   final Map<String,OrdinalMap> cachedOrdMaps = new HashMap<>();
> >
> > I wonder why this doesn't seem to be used from request to request in my
> > case, eh.
> >
> > Dawid
> >
> > On Mon, Aug 29, 2022 at 3:07 PM Dawid Weiss <dawid.we...@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > I have a situation here with Solr 8.11.2 in stand-alone mode, a large
> > > (200GB+) index with multi-valued doc value string fields. The problem is
> > > that faceting over these fields takes a long time. Before you say: "well,
> > > duh, of course" I wanted to point out that it takes a long time for *every*
> > > query, even those that collect facets from a relatively small subset of all
> > > documents (say, one hundred).
> > >
> > > Looking at and debugging the code, I see a few things that made me
> > > scratch my head, but this one is particularly troubling.
> > >
> > > So, the faceting code goes into FacetFieldProcessorByArrayDV and then most
> > > of the time is spent inside findStartAndEndOrds, looking basically for the
> > > count of unique ordinals (for all segments). Now, because this is done with
> > > a slow reader wrapper, it takes forever. And it's repeated for each and
> > > every request - even though clearly the ordinal map (or the count of
> > > values) won't change for the same reader:
> > >
> > >   @Override
> > >   protected void findStartAndEndOrds() throws IOException {
> > >     if (multiValuedField) {
> > >       si = FieldUtil.getSortedSetDocValues(fcontext.qcontext, sf, null);
> > >       if (si instanceof MultiDocValues.MultiSortedSetDocValues) {
> > >         ordinalMap = ((MultiDocValues.MultiSortedSetDocValues)si).mapping;
> > >       }
> > >
> > > If there is any rationale behind not caching the ordinal map (or just the
> > > size of all ords!) then I failed to see it. Otherwise it's a serious
> > > concern and slowdown, and fixing it could help many poor souls speed up
> > > Solr faceting.
> > >
> > > Anybody familiar with that code who could comment on the above?
> > >
> > > Dawid
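
For anyone following the thread, here is a minimal sketch of the per-field caching pattern being discussed, assuming the expensive structure can be built once per reader and then reused across requests. It is not Solr or Lucene code: `OrdMapCacheSketch`, `FakeOrdinalMap`, `PerReaderOrdMapCache`, and `buildCount` are hypothetical names made up for illustration, and `ConcurrentHashMap` is used only because the TODO in the quoted snippet mentions it. It makes no claim about why the cache appears unused in the reported case.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical illustration only; none of these types exist in Solr or Lucene. */
public class OrdMapCacheSketch {

    /** Stand-in for an expensive per-field structure such as an ordinal map. */
    static final class FakeOrdinalMap {
        final String field;
        final long valueCount;

        FakeOrdinalMap(String field, long valueCount) {
            this.field = field;
            this.valueCount = valueCount;
        }
    }

    /** Holds one FakeOrdinalMap per field for as long as this cache instance lives. */
    static final class PerReaderOrdMapCache {
        private final Map<String, FakeOrdinalMap> cachedOrdMaps = new ConcurrentHashMap<>();
        private final Function<String, FakeOrdinalMap> builder;
        private final AtomicInteger builds = new AtomicInteger();

        PerReaderOrdMapCache(Function<String, FakeOrdinalMap> builder) {
            this.builder = builder;
        }

        /** Builds the map on first access for a field and returns the cached one afterwards. */
        FakeOrdinalMap get(String field) {
            return cachedOrdMaps.computeIfAbsent(field, f -> {
                builds.incrementAndGet(); // counts how often the expensive build actually ran
                return builder.apply(f);
            });
        }

        int buildCount() {
            return builds.get();
        }
    }

    public static void main(String[] args) {
        // Assumption: the builder is the slow part (here it is just a constructor call).
        PerReaderOrdMapCache cache =
                new PerReaderOrdMapCache(f -> new FakeOrdinalMap(f, 1_000_000L));

        // Three "requests" against the same cache instance trigger a single build.
        for (int request = 0; request < 3; request++) {
            FakeOrdinalMap map = cache.get("tags");
            System.out.println("request " + request
                    + ": valueCount=" + map.valueCount
                    + ", builds so far=" + cache.buildCount());
        }
    }
}
```

The reuse only happens when every request reaches the same cache instance. If the object holding the map were recreated per request, each request would pay the build cost again, which is one thing a profiler run could confirm or rule out.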