Anria B. <anria_o...@yahoo.com> wrote:
> Thanks Toke for this. It gave us a ton to think about, and it really helps
> support the notion of several smaller indexes over one very large one,
> where we can rather distribute a few JVM processes with less size each, than
> have one massive one that is, according to this, less efficient.
There are not many clear-cut answers in Solr land... There is a fixed overhead to running a Solr instance, and you need to have some wriggle room in the heap for temporary peaks, such as index updates. That calls for a single Solr instance, or only a few, handling multiple collections. On the other hand, large Java heaps are prone to long stop-the-world garbage collections, and heaps exceeding 32GB lose compressed object pointers, which adds memory overhead.

Locally we run 50 Solr instances with an 8GB heap each, each holding a single shard. At some point I would like to try changing this to 25 instances with 15GB heaps and 2 shards each, or maybe 12 instances with 28GB heaps and 4 shards each. I will not exceed 31GB in a single JVM unless forced to.

- Toke Eskildsen
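For what it's worth, a quick back-of-the-envelope comparison of the three layouts mentioned above (a throwaway sketch, using only the instance counts and heap sizes from this thread; it says nothing about GC pause behaviour or the compressed-oops cutoff, just total committed heap):

```shell
# Total heap per layout, "instances x GB-per-instance"
for layout in "50x8" "25x15" "12x28"; do
  n=${layout%x*}   # number of Solr instances
  gb=${layout#*x}  # heap per instance in GB
  echo "$layout: $((n * gb)) GB total heap"
done
# 50x8:  400 GB total heap
# 25x15: 375 GB total heap
# 12x28: 336 GB total heap
```

So the consolidated layouts commit less total heap, at the price of larger per-JVM heaps and longer potential GC pauses, which is exactly the trade-off being weighed.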