On 5/4/2015 4:55 AM, Rishi Easwaran wrote:
> Sadly with the size of our complex, splitting and adding more HW is not a
> viable long term solution.
> I guess the options we have are to run optimize regularly and/or become
> aggressive in our merges proactively even before solr cloud gets into this
> situation.

If you are regularly deleting most of your index, or reindexing large parts of it, which effectively does the same thing, then regular optimizes may be required to keep the index size down. Remember, though, that you need enough free disk space for the core to grow while the optimize runs, or it cannot complete. If the core is 75-90 percent deleted docs, then you will not need 2x the core size to optimize it, because the new index will be much smaller.

Currently, SolrCloud will always optimize the entire collection when you ask for an optimize on any core, but it will NOT optimize all the replicas (cores) at the same time. It will go through the cores that make up the collection and optimize each one in sequence. If your index is sharded and replicated enough, hopefully that will make it possible for the optimize to complete even though the amount of disk space available may be low.

We have at least one issue in Jira where users have asked for optimize to honor distrib=false, which would allow the user to be in complete control of all optimizing, but so far that hasn't been implemented. The volunteers who maintain Solr can only accomplish so much in the limited time they have available.

Thanks,
Shawn
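
For illustration only, the disk-space point above can be put into rough numbers. This is a minimal sketch, not anything Solr provides: the function names are hypothetical, and it assumes the optimized index shrinks roughly in proportion to the fraction of deleted docs, which is only an approximation of what a real merge produces.

```python
def optimize_headroom(core_size: int, deleted_fraction: float) -> int:
    """Estimate the extra disk space an optimize needs for one core.

    Assumption (not a Solr guarantee): the rewritten index is about
    core_size * (1 - deleted_fraction), because deleted docs are dropped.
    Sizes are in whatever unit the caller uses (bytes, MB, GB).
    """
    if not 0.0 <= deleted_fraction <= 1.0:
        raise ValueError("deleted_fraction must be between 0 and 1")
    return int(round(core_size * (1.0 - deleted_fraction)))


def peak_disk_usage(core_size: int, deleted_fraction: float) -> int:
    """Old index plus new index, both on disk until the optimize finishes."""
    return core_size + optimize_headroom(core_size, deleted_fraction)


if __name__ == "__main__":
    # A 100 GB core at several deleted-doc ratios.
    for fraction in (0.0, 0.75, 0.9):
        print(fraction, optimize_headroom(100, fraction), peak_disk_usage(100, fraction))
```

Under that assumption, a core with no deleted docs needs the full 2x, while one that is 75 percent deleted docs needs only about a quarter of its current size as free space. Because the cores are optimized one at a time, the headroom only has to be available for the core currently being optimized on a given disk.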