Re: Need urgent help -- High cpu on solr
In addition to the insightful pointers by Zisis and Erick, I would like to mention an approach, in the link below, that I generally use to pinpoint exactly which threads are causing a CPU spike. Knowing this, you can understand which aspect of Solr (search threads, GC, update threads, etc.) is taking more CPU and develop a mitigation strategy accordingly (e.g., if it's a GC thread, maybe try tuning the params or switching to G1 GC). It just helps take the guesswork out of the many possible causes. Of course, the suggestions received earlier are best practices and should be taken into consideration nevertheless.

https://backstage.forgerock.com/knowledge/kb/article/a39551500

The hex number the author talks about in the link above is the native thread id.

Best,
Rahul

On Wed, Oct 14, 2020 at 8:00 AM Erick Erickson wrote:
> Zisis makes good points. One other thing is I’d look to
> see if the CPU spikes coincide with commits. But GC
> is where I’d look first.
>
> Continuing on with the theme of caches, yours are far too large
> at first glance. The default is, indeed, size=512. Every time
> you open a new searcher, you’ll be executing 128 queries
> for autowarming the filterCache and another 128 for the queryResultCache.
> Autowarming alone might be accounting for it. I’d reduce
> the size back to 512 and an autowarm count nearer 16
> and monitor the cache hit ratio. There’s little or no benefit
> in squeezing the last few percent from the hit ratio. If your
> hit ratio is small even with the settings you have, then your caches
> don’t do you much good anyway, so I’d make them much smaller.
>
> You haven’t told us how often your indexes are
> updated, which will be a significant CPU hit due to
> your autowarming.
>
> Once you’re done with that, I’d then try reducing the heap. Most
> of the actual searching is done in Lucene via MMapDirectory,
> which resides in the OS memory space.
>
> See:
>
> https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>
> Finally, if it is GC, consider G1GC if you’re not using that
> already.
>
> Best,
> Erick
>
> > On Oct 14, 2020, at 7:37 AM, Zisis T. wrote:
> >
> > The values you have for the caches and maxWarmingSearchers do not look
> > like the defaults. Cache sizes are 512 for the most part and
> > maxWarmingSearchers is 2 (if not, limit it to 2).
> >
> > Sudden CPU spikes probably indicate GC issues. The # of documents you have
> > is small, but are they huge documents? The # of collections is OK in general,
> > but since they are crammed into 5 Solr nodes the memory requirements might be
> > bigger, especially if the filter and other caches get populated with 50K
> > entries.
> >
> > I'd first go through the GC activity to make sure that this is not causing
> > the issue. The fact that you lose some Solr servers is also an indicator of
> > large GC pauses that might create a problem when Solr communicates with
> > Zookeeper.
> >
> > --
> > Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
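The technique in the ForgeRock article Rahul links (find the hot OS thread with `top -H -p <pid>`, then match it against a `jstack` dump) hinges on one conversion step, which can be sketched like this. The sample dump line and thread name below are made up for illustration:

```python
# Sketch of the thread-pinpointing step described above: `top -H -p <pid>`
# reports hot threads by *decimal* OS thread id, while a `jstack` dump
# labels each JVM thread with the same id in hex ("nid=0x...").
def nid_hex(tid: int) -> str:
    """Decimal thread id (from top -H) -> the nid=0x... form jstack uses."""
    return hex(tid)

def find_thread(jstack_output: str, tid: int):
    """Return the jstack header line for the thread with this OS thread id."""
    needle = f"nid={nid_hex(tid)}"
    for line in jstack_output.splitlines():
        if needle in line:
            return line
    return None

# Made-up one-line dump for illustration; a real dump has full stack traces.
dump = '"qtp1234-42" #42 prio=5 os_prio=0 tid=0x00007f0a nid=0x3e9 runnable'
print(find_thread(dump, 1001))  # 1001 decimal == 0x3e9 hex
```

The matched header line then tells you whether the hot thread is a query thread (Jetty `qtp*` pool), a GC thread, or an update/merge thread.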
Re: Need urgent help -- High cpu on solr
Zisis makes good points. One other thing is I’d look to see if the CPU spikes coincide with commits. But GC is where I’d look first.

Continuing on with the theme of caches, yours are far too large at first glance. The default is, indeed, size=512. Every time you open a new searcher, you’ll be executing 128 queries for autowarming the filterCache and another 128 for the queryResultCache. Autowarming alone might be accounting for it. I’d reduce the size back to 512 and an autowarm count nearer 16 and monitor the cache hit ratio. There’s little or no benefit in squeezing the last few percent from the hit ratio. If your hit ratio is small even with the settings you have, then your caches don’t do you much good anyway, so I’d make them much smaller.

You haven’t told us how often your indexes are updated, which will be a significant CPU hit due to your autowarming.

Once you’re done with that, I’d then try reducing the heap. Most of the actual searching is done in Lucene via MMapDirectory, which resides in the OS memory space. See:

https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

Finally, if it is GC, consider G1GC if you’re not using that already.

Best,
Erick

> On Oct 14, 2020, at 7:37 AM, Zisis T. wrote:
>
> The values you have for the caches and maxWarmingSearchers do not look
> like the defaults. Cache sizes are 512 for the most part and
> maxWarmingSearchers is 2 (if not, limit it to 2).
>
> Sudden CPU spikes probably indicate GC issues. The # of documents you have
> is small, but are they huge documents? The # of collections is OK in general,
> but since they are crammed into 5 Solr nodes the memory requirements might be
> bigger, especially if the filter and other caches get populated with 50K
> entries.
>
> I'd first go through the GC activity to make sure that this is not causing
> the issue. The fact that you lose some Solr servers is also an indicator of
> large GC pauses that might create a problem when Solr communicates with
> Zookeeper.
>
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
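In solrconfig.xml terms, the smaller caches Erick suggests would look something like the sketch below. The size and autowarmCount values follow his numbers; the cache classes and initialSize are illustrative, not from the thread:

```xml
<!-- Sketch only: size/autowarmCount per Erick's suggestion;
     class and initialSize are illustrative defaults. -->
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="16"/>

<queryResultCache class="solr.LRUCache"
                  size="512"
                  initialSize="512"
                  autowarmCount="16"/>
```

With these settings, opening a new searcher replays only 16 entries per cache instead of tens of thousands, which is the autowarming CPU cost Erick is describing.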
Re: Need urgent help -- High cpu on solr
The values you have for the caches and maxWarmingSearchers do not look like the defaults. Cache sizes are 512 for the most part and maxWarmingSearchers is 2 (if not, limit it to 2).

Sudden CPU spikes probably indicate GC issues. The # of documents you have is small, but are they huge documents? The # of collections is OK in general, but since they are crammed into 5 Solr nodes the memory requirements might be bigger, especially if the filter and other caches get populated with 50K entries.

I'd first go through the GC activity to make sure that this is not causing the issue. The fact that you lose some Solr servers is also an indicator of large GC pauses that might create a problem when Solr communicates with Zookeeper.

--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
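To go through the GC activity as Zisis suggests, GC logging can be enabled in solr.in.sh. The sketch below assumes the stock bin/solr scripts and a Java 8 JVM (on Java 9+ the equivalent is a single `-Xlog:gc*` option); the pause-time target is illustrative:

```shell
# Sketch for solr.in.sh: enable GC logging so pauses can be correlated
# with the CPU spikes and node drop-outs. Flags assume Java 8.
GC_LOG_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
  -XX:+PrintGCApplicationStoppedTime"

# If GC turns out to be the culprit, Erick's G1 suggestion would go here
# (MaxGCPauseMillis value is illustrative):
GC_TUNE="-XX:+UseG1GC -XX:MaxGCPauseMillis=250"
```

Long `Total time for which application threads were stopped` entries lining up with the spikes, and with the ZooKeeper session losses, would confirm Zisis's theory.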
Need urgent help -- High cpu on solr
I am using Solr 8.2 with ZooKeeper 3.4, configured as a 5-node Solr Cloud with around 100 collections, each collection having ~20k documents. The nodes are VMs with 6-core CPUs, 2 cores per socket.

All of a sudden we are seeing spikes on the CPUs, which brought down some nodes (GONE state in Solr Cloud); we also saw latency when trying to log in to those nodes over SSH.

Memory: 32 GB, of which 20 GB is allotted to the JVM heap in the Solr config.

[solrconfig.xml snippet; the XML tags were stripped in the plain-text mail, leaving only the values: 200 100 true false 4]

These are just the defaults that shipped with the Solr package. One data point is that these nodes get very frequent search hits, so do I need to consider increasing the above sizes to bring the CPU usage down and see a more stable Solr Cloud?

--
Thanks & Regards,
Yaswanth Kumar Konathala.
yaswanth...@gmail.com
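A quick back-of-the-envelope check on the memory split described here, illustrating Erick's point above about reducing the heap: everything not taken by the JVM is all the OS page cache gets for Lucene's memory-mapped index files. The 2 GB operating-system allowance is my assumption, not a figure from the thread:

```python
# Rough sanity check on the memory split (32 GB RAM, 20 GB heap, as in
# the post; the 2 GB OS/other-processes allowance is an assumption).
total_ram_gb = 32
heap_gb = 20            # JVM heap allotted in the Solr config
os_and_other_gb = 2     # assumed: OS, ssh, monitoring, etc.

# Remainder is what MMapDirectory can use for the index files
# across all ~100 collections on the node.
page_cache_gb = total_ram_gb - heap_gb - os_and_other_gb
print(page_cache_gb)  # 10
```

Shrinking the heap (once the caches are smaller) shifts gigabytes from a space the garbage collector must manage into page cache that serves searches with no GC cost at all.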