On Tue, May 2, 2023 at 2:34 PM Robert Muir <rcm...@gmail.com> wrote:
>
> On Tue, May 2, 2023 at 12:49 PM Michael Froh <msf...@gmail.com> wrote:
> >
> > Hi all,
> >
> > I was looking into a customer issue where they noticed some increased GC
> > time after upgrading from Lucene 7.x to 9.x. After taking some heap dumps
> > from both systems, the big difference was tracked down to the float[256]
> > allocated (as a norms cache) when creating a BM25Scorer (in
> > BM25Similarity.scorer()).
> >
> > The change seems to have come in with
> > https://github.com/apache/lucene/commit/8fd7ead940f69a892dfc951a1aa042e8cae806c1,
> > which removed some of the special-case logic around the "non-scoring
> > similarity" embedded in IndexSearcher (returned in the false case from the
> > old IndexSearcher#scorer(boolean needsScores)).
> >
> > While I really like that we no longer have that special-case logic in
> > IndexSearcher, we now have the issue that every time we create a new
> > TermWeight (or other Weight) it allocates a float[256], even if the
> > TermWeight doesn't need scores. Also, I think it's the exact same
> > float[256] for all non-scoring weights, since it's being computed using the
> > same "all 1s" CollectionStatistics and TermStatistics.
> >
> > (For the record, yes, the queries in question have an obscene number of
> > TermQueries, so 1024 bytes times lots of TermWeights, times multiple
> > queries running concurrently makes lots of heap allocation.)
> >
> > I'd like to submit a patch to fix this, but I'm wondering what approach to
> > take. One option I'm considering is precomputing a singleton float[256] for
> > the non-scoring case (where CollectionStatistics and TermStatistics are all
> > 1s). That would have the least functional impact, but would let all
> > non-scoring clauses share the same array. Is there a better way to tackle
> > this?
> >
>
> This seems ok if it isn't invasive. I still feel like something is
> "off" if you are seeing GC time from 1KB-per-segment allocation. Do
> you have way too many segments?
>
> Originally (for various similar reasons) there was a place in the API
> to do this, so it would only happen per-Weight instead of per-Scorer,
> which was the SimWeight that got eliminated by the commit you point
> to. But I'd love if we could steer clear of that complexity:
> simplifying the API here was definitely the right move. It's been more
> than 5 years since this change was made, and this is the first
> complaint I've heard about the 1KB, which is why I asked about your
> setup.
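For concreteness, a minimal sketch of the singleton idea Michael describes (the class and method names here are hypothetical, and the loop only mirrors the general shape of the cache that BM25Similarity.scorer() builds; with the "all 1s" statistics, avgdl works out to 1, so the array contents are fixed for a given k1 and b):

    import org.apache.lucene.util.SmallFloat;

    // Hypothetical sketch of the proposed fix: build the non-scoring norm
    // cache once and let every non-scoring weight share the same array,
    // instead of allocating a fresh float[256] per Weight.
    final class NonScoringNormCache {

      // With all-1s CollectionStatistics/TermStatistics, avgdl == 1, so the
      // cache is a pure function of k1 and b (BM25 defaults shown here). The
      // values are never used for ranking in the non-scoring case, so
      // sharing one immutable array should be safe.
      static final float[] SHARED = buildCache(1.2f, 0.75f, 1f);

      private static float[] buildCache(float k1, float b, float avgdl) {
        float[] cache = new float[256];
        for (int i = 0; i < cache.length; i++) {
          // Decode the one-byte norm back to an approximate field length,
          // then precompute the inverse of the BM25 length-normalization term.
          float dl = SmallFloat.byte4ToInt((byte) i);
          cache[i] = 1f / (k1 * ((1 - b) + b * dl / avgdl));
        }
        return cache;
      }

      private NonScoringNormCache() {}
    }

BM25Similarity.scorer() could then hand out the shared array whenever it sees the all-1s statistics, and fall back to the per-call allocation otherwise.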
One last thought: we should re-check if the cache is still needed :) I think decoding norms used to be more expensive in the past. This cache is now only precomputing part of the BM25 formula to save some add/multiply/divide.
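To make that last point concrete, a rough sketch (not the actual Lucene source) of the per-hit arithmetic with and without the cache, assuming the 9.x-style scorer where the cache holds the inverse of the length-normalization term; the parameter values below are illustrative:

    import org.apache.lucene.util.SmallFloat;

    // Rough sketch of what the float[256] buys per scored hit: one array
    // lookup in place of a norm decode plus a handful of float ops.
    class Bm25CacheSketch {
      final float k1 = 1.2f, b = 0.75f; // BM25 parameters
      final float avgdl = 10f;          // average field length (example value)
      final float weight = 2f;          // boost * idf, folded into one factor
      final float[] cache = new float[256];

      Bm25CacheSketch() {
        for (int i = 0; i < 256; i++) {
          float dl = SmallFloat.byte4ToInt((byte) i);
          cache[i] = 1f / (k1 * ((1 - b) + b * dl / avgdl));
        }
      }

      // With the cache: one lookup, then multiply, add, divide, subtract.
      float scoreCached(float freq, long norm) {
        float normInverse = cache[((byte) norm) & 0xFF];
        return weight - weight / (1f + freq * normInverse);
      }

      // Without the cache: the same result, at the cost of re-decoding the
      // norm and redoing the multiply/add/divide the cache had precomputed.
      float scoreUncached(float freq, long norm) {
        float dl = SmallFloat.byte4ToInt((byte) norm);
        float normInverse = 1f / (k1 * ((1 - b) + b * dl / avgdl));
        return weight - weight / (1f + freq * normInverse);
      }
    }

Whether those few extra float ops per hit are still measurable on modern hardware is exactly the question; if they aren't, dropping the cache would make the allocation issue moot.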