On 7/26/2017 1:49 AM, Atita Arora wrote:
> We did our functional and load testing on these boxes; however, when we
> released it to production along with the same application (using SolrJ to
> query Solr), we ran into severe CPU issues.
> Just to add, we're on master-slave replication, where the master has its
> index on NRTCachingDirectory and the slave on RAMDirectory.
>
> As soon as we placed the slaves under the load balancer, even under NO LOAD
> conditions, the slave went from a load of 4 -> 10 -> 16 -> 100 in
> 12 minutes.
>
> I suspected this to be caused by replication, but it was never-ending,
> so before it crashed we de-provisioned it and brought it down.
>
> I'm not sure what could possibly cause it.
>
> I looked into the caches, where documentCache, filterCache, and
> queryResultCache are set to the defaults of 1024 and 100 documents.
>
> I tried observing the GC activity in GCViewer too, which doesn't really
> show anything alarming (as far as I can tell) - a sampler is at
> https://pastebin.com/cnuATYrS
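
Regarding the caches you mention: those sizes live in solrconfig.xml.  As a
rough sketch, the stock cache section looks something like this (these are
the shipped defaults, not necessarily your exact values):

  <query>
    <filterCache class="solr.FastLRUCache"
                 size="512" initialSize="512" autowarmCount="0"/>
    <queryResultCache class="solr.LRUCache"
                      size="512" initialSize="512" autowarmCount="0"/>
    <documentCache class="solr.LRUCache"
                   size="512" initialSize="512" autowarmCount="0"/>
  </query>

The 1024 and 100 you mention would simply be different values for the size
attributes there.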

What OS is Solr running on?  I'm only asking because some additional
information I'm after has different gathering methods depending on OS. 
Other questions:

Is there only one Solr process per machine, or more than one?
How many total documents are managed by one machine?
How big is all the index data managed by one machine?
What is the max heap on each Solr process?

FYI, RAMDirectory is not the preferred way of running Solr or Lucene. 
If you have enough memory to hold the entire index, it's better to let
the OS handle keeping that information in memory, rather than having
Lucene and Java do it.

http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

NRTCachingDirectoryFactory uses MMap by default as its delegate
implementation, so your master is fine.
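
If the slave's solrconfig.xml is explicitly selecting RAMDirectory, switching
back to the stock factory is a one-line change.  A sketch of what that line
usually looks like (the stock default, which delegates to MMap on 64-bit
systems):

  <directoryFactory name="DirectoryFactory"
                    class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>

On the slave, that would take the place of a line naming
solr.RAMDirectoryFactory.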

I would be interested in getting a copy of Solr's gc log from a system
with high CPU to look at.

Thanks,
Shawn
