Adding another detail in case it matters: all indexing is happening on a standalone Solr instance with a single core.
On Tue, May 30, 2023 at 5:34 PM Rahul Goswami <[email protected]> wrote:

> Hi,
> We started experiencing slowness with updates in production after
> upgrading from Solr 7.7.2 to 8.11.1. Upon comparing the performance it
> turns out that indexing 20 million docs via atomic updates through the same
> client program (running 15 parallel threads indexing in batches of 1000)
> takes below time:
>
> Solr 7 : 78 mins
> Solr 8: 370 mins
>
> Environment details:
> - Java 11 on Windows server
> - Xms1536m Xmx3072m
> - Indexing client code running 15 parallel threads indexing in batches of
> 1000
> - using SimpleFSDirectoryFactory (since Mmap doesn't quite work well on
> Windows for our index sizes which commonly run north of 1 TB)
>
> Looking at the thread dump, the bottleneck seems to be RealTimeGet and I
> can see that Solr 7 takes a different code path than Solr 8. Note that the
> performance of regular updates (non-atomic) is still pretty good on Solr 8
> completing in < 1 hour for the same 20 million data set.
>
> Wondering if others have faced the same on Solr 8 and if this issue
> deserves a Jira ticket.
>
> Sharing the indexing code, solrconfig, schema and thread dumps in the link
> below:
>
> https://drive.google.com/drive/folders/1q2DPNTYQEU6fi3NeXIKJhaoq3KPnms0h?usp=sharing
>
> Thanks for any inputs!
>
> -Rahul
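
For anyone who wants to reproduce the pattern without opening the shared folder, here is a minimal sketch of how atomic-update request bodies can be grouped into batches of 1000. It is not the actual client code from the link. The field name `status_s`, the value `done`, the `id` unique key, and the helper name `atomic_update_batches` are all assumptions made for illustration. Only the `{"set": ...}` modifier is standard Solr atomic-update JSON syntax.

```python
import json


def atomic_update_batches(doc_ids, field, value, batch_size=1000):
    """Yield JSON request bodies, each holding up to batch_size atomic updates.

    Each document carries only its unique key plus a {"set": value} modifier,
    which is what makes the update atomic rather than a full re-index.
    """
    batch = []
    for doc_id in doc_ids:
        # Assumes the schema's unique key field is named "id".
        batch.append({"id": doc_id, field: {"set": value}})
        if len(batch) == batch_size:
            yield json.dumps(batch)
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield json.dumps(batch)


if __name__ == "__main__":
    # Hypothetical IDs; a real client would read them from its data source.
    ids = (f"doc{i}" for i in range(2500))
    for body in atomic_update_batches(ids, "status_s", "done"):
        print(len(json.loads(body)), "updates in this batch")
```

Each yielded body would be POSTed to the core's `/update` handler with a JSON content type. A real client along the lines described above would hand these bodies to a pool of 15 worker threads.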
