> One machine runs with a 3TB drive, running 3 solr processes (each with
one core as described above).

How much total memory on the machine?

Kevin Risden

> Thanks for a quick and detailed response, Erick!
> Unfortunately I don't have proof, but our servers with solr 4.5 are
> running really nicely with the above config. I had assumed that the same or
> similar settings would also perform well with Solr 7, but that assumption
> didn't hold, as a lot has changed in 3 major releases.
> I have tweaked the cache values as you suggested, but increasing or
> decreasing them doesn't seem to make any noticeable improvement.
> At the moment, my one core has an 800GB index, ~450 million documents, and
> 48G Xmx. GC pauses haven't been an issue though. One machine runs with a 3TB
> drive, running 3 solr processes (each with one core as described above). I
> agree that it is a very atypical system, so I should probably try different
> parameters with a fresh eye to find the solution.
> I tried autocommits (hard commit with openSearcher=false every half minute,
> and soft commit every 5 minutes). That supported the hypothesis that the
> query throughput decreases after opening a new searcher and **not** after
> committing the index. Cache hit ratios are all 80+% (even when I
> decreased the filterCache to 128, so I will keep it at this lower value).
> The document cache hit ratio is really bad; it drops to around 40% after
> newSearcher. But I guess that is expected, since it cannot be warmed up
> anyway.
> Thanks
> Nawab
> > What evidence do you have that the changes you've made to your configs
> > are useful? There are lots of things in here that are suspect:
> >   <double name="forceMergeDeletesPctAllowed">1</double>
> >
> > First, this is useless unless you are forceMerging/optimizing. Which
> > you shouldn't be doing under most circumstances. And you're going to
> > be rewriting a lot of data every time. See:
> >
> > https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/
> >
> > A filterCache with size="10240" is far in excess of what we usually
> > recommend. Each entry can be up to maxDoc/8 bytes and you have 10K of them.
> > Why did you choose this? On the theory that "more is better"? If
> > you're using NOW then you may not be using the filterCache well, see:
> >
> > https://lucidworks.com/2012/02/23/date-math-now-and-filter-queries/
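To put the maxDoc/8 figure in perspective, here is a rough worst-case estimate. It assumes the ~450 million documents per core reported elsewhere in this thread, and that every one of the 10240 entries is stored as a full bitset. Real entries are often smaller, so treat this as an upper bound rather than a measurement.

```latex
\[
\frac{450{,}000{,}000\ \text{bits}}{8\ \text{bits/byte}} = 56.25\ \text{MB per entry},
\qquad
56.25\ \text{MB} \times 10240\ \text{entries} = 576\ \text{GB}
\]
```

Under those assumptions, a completely full cache would need roughly twelve times the 48 GB heap.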
> >
> > autowarmCount="1024"
> >
> > Every time you commit you're firing off 1024 queries which is going to
> > spike the CPU a lot. Again, this is super-excessive. I usually start
> > with 16 or so.
> >
> > Why are you committing from a cron job? Why not just set your
> > autocommit settings and forget about it? That's what they're for.
> >
> > Your queryResultCache is likewise kind of large, but it takes up much
> > less space than the filterCache per entry so it's probably OK. I'd
> > still shrink it and set the autowarm to 16 or so to start, unless
> > you're seeing a pretty high hit ratio, which is pretty unusual but
> > does happen.
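As a sketch of the cache advice above: the autowarmCount of 16 comes from the suggestion in this message, while the sizes of 512 are only an assumed starting point (in the range of the stock example solrconfig.xml), not a tuned recommendation. The cache classes are kept as in the original config.

```xml
<!-- Sketch only: size/initialSize are assumed starting points, not tuned values. -->
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="16"/>
<!-- autowarmCount="16" follows the "start with 16 or so" advice above. -->
<queryResultCache class="solr.LRUCache"
                  size="512"
                  initialSize="512"
                  autowarmCount="16"/>
```

From there, sizes can be raised one at a time while watching the hit ratio and warm-up time for each cache.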
> >
> > 48G of memory is just asking for long GC pauses. How many docs do you
> > have in each core anyway? If you're really using this much heap, then
> > it'd be good to see what you can do to shrink it. Enabling docValues
> > for all fields you facet, sort or group on will help that a lot if you
> > haven't already.
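For illustration, enabling docValues is a per-field attribute in the schema. The field name "category" and its type below are hypothetical placeholders, not taken from the poster's schema.

```xml
<!-- Hypothetical field: "category" and type "string" are placeholders. -->
<!-- docValues="true" stores a column-oriented copy used for faceting, sorting and grouping. -->
<field name="category" type="string" indexed="true" stored="true" docValues="true"/>
```

Turning docValues on for an existing field generally requires reindexing the data before the change takes effect.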
> >
> > How much memory on your entire machine? And how much is used by _all_
> > the JVMs you are running on a particular machine? MMapDirectory needs as
> > much OS memory space as it can get, see:
> > http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
> >
> > Lately we've seen some structures that consume memory until a commit
> > happens (either soft or hard). I'd shrink my autocommit down to 60
> > seconds or even less (openSearcher=false).
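A minimal sketch of what replacing the cron job with autocommit settings could look like in solrconfig.xml. The 60-second hard commit with openSearcher=false follows the suggestion above. The 5-minute soft commit is an assumption chosen to mirror the cadence of the original cron job, not something recommended in this message.

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flushes to disk every 60 s without opening a new searcher. -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Soft commit: makes changes visible to searches. 300000 ms is an assumed
       value matching the old 5-minute cron interval. -->
  <autoSoftCommit>
    <maxTime>300000</maxTime>
  </autoSoftCommit>
</updateHandler>
```

With this split, durability (the hard commit) and visibility (the soft commit) are controlled independently, and only the soft commit opens a new searcher.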
> >
> > In short, I'd go back mostly to the default settings and build _up_ as
> > you can demonstrate improvements. You've changed enough things here
> > that untangling which one is the culprit will be hard. You want the
> > JVM to have as little memory as possible; unfortunately, that's
> > something you figure out by experimentation.
> >
> > Best,
> > Erick
> >
> > > Hi,
> > >
> > > I am committing every 5 minutes using a periodic cron job: "curl
> > > http://localhost:8984/solr/core1/update?commit=true". Besides this, my
> > > app doesn't do any soft or hard commits. With the Solr 7 upgrade, I am
> > > noticing that query throughput plummets every 5 minutes, probably when
> > > the commit happens.
> > > What can I do to improve this? It didn't use to happen like this in
> > > solr4.5 (i.e., I used to get a stable query throughput of 50-60 queries
> > > per second; now there are spikes to 60 qps interleaved with drops to
> > > almost **0**). Between those 5 minutes, I am able to achieve high
> > > throughput, hence I guess that the issue is related to indexing or
> > > merging, and not query flow.
> > >
> > > I have 48G allotted to each solr process, and it seems that only ~50% is
> > > being used at any time; similarly, CPU is not spiking beyond 50% either.
> > > There is frequent merging (every 5 minutes), but I am not sure if that is
> > > a cause of the slowdown.
> > >
> > > Here are my merge and cache settings:
> > >
> > > Thanks
> > > Nawab
> > >
> > > <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
> > >   <int name="maxMergeAtOnce">5</int>
> > >   <int name="segmentsPerTier">5</int>
> > >   <int name="maxMergeAtOnceExplicit">10</int>
> > >   <int name="floorSegmentMB">16</int>
> > >   <!-- 50 gb -->
> > >   <double name="maxMergedSegmentMB">50000</double>
> > >   <double name="forceMergeDeletesPctAllowed">1</double>
> > > </mergePolicyFactory>
> > >
> > >
> > >
> > >
> > > <filterCache class="solr.FastLRUCache"
> > >              size="10240"
> > >              initialSize="5120"
> > >              autowarmCount="1024"/>
> > > <queryResultCache class="solr.LRUCache"
> > >                  size="10240"
> > >                  initialSize="5120"
> > >                  autowarmCount="0"/>
> > > <documentCache class="solr.LRUCache"
> > >                size="10240"
> > >                initialSize="5120"
> > >                autowarmCount="0"/>
> > >
> > >
> > > <useColdSearcher>false</useColdSearcher>
> > >
> > > <maxWarmingSearchers>2</maxWarmingSearchers>
> > >
> > > <listener event="newSearcher" class="solr.QuerySenderListener">
> > >   <arr name="queries">
> > >   </arr>
> > > </listener>
> > > <listener event="firstSearcher" class="solr.QuerySenderListener">
> > >   <arr name="queries">
> > >   </arr>
> > > </listener>
