Hi Marc,

You could try running git bisect on the Lucene repository to pinpoint the commit that caused what you're observing. Each step needs a build, so it'll take some time, but the search is logarithmic in the number of commits and you'd know for sure where the problem is.
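To illustrate why the number of builds stays small, here is a minimal Python sketch of the kind of search git bisect performs over a linear history. It is not git's actual implementation. The commit names, the count of 300 commits, and the `is_bad` predicate are all hypothetical; in practice `is_bad` would stand for "build Lucene at this commit, run the indexing benchmark, and see whether throughput has dropped".

```python
from typing import Callable, Sequence, Tuple


def first_bad_commit(
    commits: Sequence[str], is_bad: Callable[[str], bool]
) -> Tuple[str, int]:
    """Return the first bad commit and how many commits were tested.

    Assumptions (hypothetical, for illustration only):
    - commits are ordered oldest to newest,
    - commits[0] is known good and commits[-1] is known bad,
    - once a commit is bad, every later commit is bad too.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    tested = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tested += 1  # each test corresponds to one build plus one benchmark run
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi], tested


if __name__ == "__main__":
    # Hypothetical history of 300 commits with a made-up culprit.
    commits = [f"c{i:03d}" for i in range(300)]
    culprit = "c137"
    found, tested = first_bad_commit(commits, lambda c: c >= culprit)
    print(f"first bad commit: {found}, commits tested: {tested}")
```

Because the interval halves on every test, a range of a few hundred commits needs fewer than ten builds rather than hundreds.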
D.

On Thu, Apr 18, 2024 at 11:16 PM Marc Davenport <madavenp...@cargurus.com.invalid> wrote:

> Hi Adrien et al,
> I've been doing some investigation today and it looks like whatever the
> change is, it happens between 9.4.2 and 9.5.0.
> I made a smaller test set up for our code that mocks our documents and just
> runs through the indexing portion of our code sending in batches of 4k
> documents at a time. This way I can run it locally.
> 9.4.2: ~1200-2000 documents per second
> 9.5.0: ~150-400 documents per second
>
> I'll continue investigating, but nothing in the release notes jumped out to
> me.
> https://lucene.apache.org/core/9_10_0/changes/Changes.html#v9.5.0
>
> Sorry I don't have anything more rigorous yet. I'm doing this
> investigation in parallel with some other things.
> But any insight or suggestions on areas to look would be appreciated.
> Thank you,
> Marc
>
> On Wed, Apr 17, 2024 at 4:18 PM Adrien Grand <jpou...@gmail.com> wrote:
>
> > Hi Marc,
> >
> > Nothing jumps to mind as a potential cause for this 2x regression. It
> > would be interesting to look at a profile.
> >
> > On Wed, Apr 17, 2024 at 9:32 PM Marc Davenport
> > <madavenp...@cargurus.com.invalid> wrote:
> >
> > > Hello,
> > > I'm finally migrating Lucene from 8.11.2 to 9.10.0 as our overall build
> > > can now support Java 11. The quick first step of renaming packages and
> > > importing the new libraries has gone well. I'm even seeing a nice
> > > performance bump in our average query time. I am however seeing a
> > > dramatic increase in our indexing time. We are indexing ~3.1 million
> > > documents each with about 100 attributes used for facet filter, and
> > > sorting; no lexical text search. Our indexing time has jumped from ~1k
> > > seconds to ~2k seconds. I have yet to profile the individual aspects of
> > > how we convert our data to records vs time for the index writer to
> > > accept the documents.
> > > I'm curious if other users discovered this for their migrations at some
> > > point. Or if there are some changes to defaults that I did not see in
> > > the migration guide that would account for this? Looking at the logs I
> > > can see that as we are indexing the documents we commit every 10 minutes.
> > > Thank you,
> > > Marc
> >
> > --
> > Adrien