Does your application see a lot of document updates/deletes?
GITHUB#11761 may have affected you. Whenever I see long indexing times,
my first suspicion is increased merge activity.
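
One way to check whether the slowdown is uniform or concentrated in a few
stalled batches (which would be consistent with merges kicking in) is to time
each batch separately. The following is only a rough sketch using the JDK
alone: the class name BatchThroughput and its methods are hypothetical, and
the no-op consumer in main is a placeholder for whatever call actually hands a
batch to the index writer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BatchThroughput {

    /** Documents per second for one batch; elapsed time is clamped to 1 ns to avoid dividing by zero. */
    public static double docsPerSecond(int docCount, long elapsedNanos) {
        return docCount * 1_000_000_000.0 / Math.max(1L, elapsedNanos);
    }

    /**
     * Splits docs into consecutive batches of at most batchSize, passes each
     * batch to indexBatch, and returns the measured docs/sec for every batch.
     */
    public static <T> List<Double> measure(List<T> docs, int batchSize, Consumer<List<T>> indexBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<Double> rates = new ArrayList<>();
        for (int start = 0; start < docs.size(); start += batchSize) {
            List<T> batch = docs.subList(start, Math.min(start + batchSize, docs.size()));
            long t0 = System.nanoTime();
            indexBatch.accept(batch);
            long elapsed = System.nanoTime() - t0;
            rates.add(docsPerSecond(batch.size(), elapsed));
        }
        return rates;
    }

    public static void main(String[] args) {
        // Mock documents; a real run would build the application's own documents here.
        List<Integer> docs = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            docs.add(i);
        }
        // Placeholder consumer: replace the empty body with the real indexing call.
        List<Double> rates = measure(docs, 4_000, batch -> { });
        for (int i = 0; i < rates.size(); i++) {
            System.out.printf("batch %d: %.0f docs/sec%n", i, rates.get(i));
        }
    }
}
```

If most batches run at a steady rate and a few are far slower, that points
towards periodic background work such as merging rather than a uniformly
slower per-document cost.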

Regards,
Gautam Worah.


On Thu, Apr 18, 2024 at 2:14 PM Marc Davenport
<madavenp...@cargurus.com.invalid> wrote:

> Hi Adrien et al,
> I've been doing some investigation today and it looks like whatever the
> change is, it happens between 9.4.2 and 9.5.0.
> I made a smaller test setup for our code that mocks our documents and just
> runs through the indexing portion of our code sending in batches of 4k
> documents at a time. This way I can run it locally.
> 9.4.2: ~1200-2000 documents per second
> 9.5.0: ~150-400 documents per second
>
> I'll continue investigating, but nothing in the release notes jumped out at
> me.
> https://lucene.apache.org/core/9_10_0/changes/Changes.html#v9.5.0
>
> Sorry I don't have anything more rigorous yet.  I'm doing this
> investigation in parallel with some other things.
> But any insight or suggestions on areas to look would be appreciated.
> Thank you,
> Marc
>
> On Wed, Apr 17, 2024 at 4:18 PM Adrien Grand <jpou...@gmail.com> wrote:
>
> > Hi Marc,
> >
> > Nothing jumps to mind as a potential cause for this 2x regression. It would
> > be interesting to look at a profile.
> >
> > On Wed, Apr 17, 2024 at 9:32 PM Marc Davenport
> > <madavenp...@cargurus.com.invalid> wrote:
> >
> > > Hello,
> > > I'm finally migrating Lucene from 8.11.2 to 9.10.0 as our overall build
> > > can now support Java 11. The quick first step of renaming packages and
> > > importing the new libraries has gone well.  I'm even seeing a nice
> > > performance bump in our average query time. I am however seeing a
> > > dramatic increase in our indexing time.  We are indexing ~3.1 million
> > > documents each with about 100 attributes used for facet filter, and
> > > sorting; no lexical text search.  Our indexing time has jumped from ~1k
> > > seconds to ~2k seconds.  I have yet to profile the individual aspects of
> > > how we convert our data to records vs time for the index writer to accept
> > > the documents. I'm curious if other users discovered this for their
> > > migrations at some point.  Or if there are some changes to defaults that I
> > > did not see in the migration guide that would account for this?  Looking
> > > at the logs I can see that as we are indexing the documents we commit
> > > every 10 minutes.
> > > Thank you,
> > > Marc
> > >
> >
> >
> > --
> > Adrien
> >
>
