On 6/27/2018 5:10 AM, Sharif Shahrair wrote:
Now the problem is, when we create about 1400 collections (all of them empty, i.e. no documents added yet), the Solr service goes down with an out-of-memory exception. We have a few questions:
1. When we create collections, each one takes about 8 MB to 12 MB of memory even with no documents yet. Is there any way to configure SolrCloud so that each collection initially takes less memory (say 1 MB per collection)? Then we could create 1500 collections using about 3 GB of the machine's RAM.
Solr doesn't let you dictate how much memory it allocates for a collection. It allocates what it needs, and if the heap size is too small for that, you get an OutOfMemoryError (OOME).
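If you want to experiment with heap sizes, the JVM heap ceiling is set when Solr starts, typically in the solr.in.sh include script. A minimal sketch (the 6g value and file location are only placeholders; size the heap to your own workload):

```shell
# solr.in.sh (the startup include script; its location varies by install method)
# Raise the JVM heap ceiling. 6g is only an illustrative value, not a recommendation.
SOLR_HEAP="6g"
```

Equivalently, `bin/solr start -m 6g` sets the same limit for a single start.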
You're going to need a lot more than two Solr servers to handle that
many collections, and they're going to need more than 12GB of memory.
You should already have at least three servers in your setup, because
ZooKeeper requires three servers for redundancy.
http://zookeeper.apache.org/doc/r3.4.12/zookeeperAdmin.html#sc_zkMulitServerSetup
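As a sketch, a minimal three-node ensemble configuration for that page's setup looks something like the following zoo.cfg, placed on each ZooKeeper node (hostnames and the data directory are placeholders):

```shell
# zoo.cfg -- identical on all three ZooKeeper nodes (hostnames are examples)
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
```

Each node also needs a myid file in dataDir containing its server number (1, 2, or 3).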
Handling a large number of collections is one area where SolrCloud needs
improvement. Work is constantly happening towards this goal, but it's a
very complex piece of software, so making design changes is not trivial.
2. Is there any way to clear/flush SolrCloud's cache, especially for collections we haven't accessed in a while? (Maybe we could take those inactive collections out of memory and load them back when they are needed again.)
Unfortunately the functionality that allows index cores to be unloaded (which we have colloquially called "LotsOfCores") does not work when Solr is running in SolrCloud mode. SolrCloud functionality would break if its cores were unloaded. It would take a fair amount of development effort to make the two features work together.
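For reference only, this is roughly how LotsOfCores is enabled per core in standalone (non-cloud) mode, via core.properties; it is shown to illustrate what is unavailable in SolrCloud, and the core name is a placeholder:

```shell
# core.properties for one core -- standalone mode only; has no effect in SolrCloud
name=tenant_001
# allow this core to be unloaded when the transient core cache is full
transient=true
# do not load the core until it is first requested
loadOnStartup=false
```

The size of the transient core cache is set separately (transientCacheSize in solr.xml).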
3. Is there any way to collect the garbage memory from SolrCloud (for example, memory freed by deleting documents and collections)?
Java handles garbage collection automatically. It is possible to explicitly ask the system to collect garbage, but any good Java programming guide will recommend that programmers NOT explicitly trigger GC. While it might be possible for Solr's memory usage to become more efficient through development effort, it's already pretty good. To our knowledge, Solr does not currently have any memory leak bugs, and if any are found, they are taken seriously and fixed as quickly as we can fix them.
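The one reliable way to hand resources back for a collection you no longer need is to delete it, which unloads its cores. A sketch using the Collections API (the host, port, and collection name are placeholders for your own deployment):

```shell
# Delete an unused collection; Solr unloads its cores and removes the index data.
# "old_tenant" is a hypothetical collection name.
curl "http://localhost:8983/solr/admin/collections?action=DELETE&name=old_tenant"
```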
Our target is, without increasing the hardware resources, to create the maximum number of collections while keeping the highly accessed collections and documents in memory. We'll appreciate your help.
That goal will require a fair amount of hardware. You may have no
choice but to increase your hardware resources.
Thanks,
Shawn