"Some large segments were merged into 12GB segments and deleted documents were physically removed.” and “So with the current natural merge strategy, I need to update solrconfig.xml and increase the maxMergedSegmentMB often"
I strongly recommend you do not continue down this path. You’re making a mountain out of a molehill. You have offered no proof that removing the deleted documents is noticeably improving performance. If you replace docs randomly, deleted docs will be removed eventually with the default merge policy without you doing _anything_ special at all.

The fact that you think you need to continuously bump up the size of your segments indicates your understanding is incomplete. When you start changing settings basically at random in order to “fix” a problem, especially one that you haven’t demonstrated _is_ a problem, you invariably make the problem worse.

By making segments larger, you’ve increased the work Solr (well, Lucene) has to do in order to merge them, since the merge process has to handle these larger segments. That’ll take longer. There is a fixed number of threads that do merging. If they’re all tied up, incoming updates will block until a thread frees up. I predict that if you continue down this path, eventually your updates will start to misbehave and you’ll spend a week trying to figure out why.

If you insist on worrying about deleted documents, just expungeDeletes occasionally. I’d also set the segment size back to the default 5G. I can’t emphasize strongly enough that the way you’re approaching this will lead to problems, not to mention maintenance that is harder than it needs to be.

If you do set the max segment size back to 5G, your 12G segments will _not_ merge until they have lots of deletes, making your problem worse. Then you’ll spend time trying to figure out why.

Recovering from what you’ve done already has problems. Those large segments _will_ get rewritten (we call it “singleton merge”) when they’ve accumulated a lot of deletes, but meanwhile you’ll think that your problem is getting worse and worse.

When those large segments have more than 10% deleted documents, expungeDeletes will singleton merge them and they’ll gradually shrink.
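To make the 10% rule concrete, here is a rough Python sketch of the check I have in mind. It is only an illustration: the dictionary keys (name, size_bytes, max_doc, deleted_docs) are hypothetical names I made up, so map them from however you collect per-segment numbers (for example the Segments Info screen in the admin UI), and the 5G figure assumes you are on the default max segment size.

```python
# Assumption: 5G default max merged segment size, expressed in bytes.
FIVE_GB = 5 * 1024**3


def segments_to_expunge(segments, max_bytes=FIVE_GB, min_deleted_ratio=0.10):
    """Return names of oversized segments with more than 10% deleted docs.

    Each segment is a dict with hypothetical keys:
      name         - segment name
      size_bytes   - on-disk size of the segment
      max_doc      - total docs in the segment, deleted ones included
      deleted_docs - number of deleted docs in the segment
    """
    flagged = []
    for seg in segments:
        if seg["max_doc"] == 0:
            continue  # empty segment, nothing to measure
        deleted_ratio = seg["deleted_docs"] / seg["max_doc"]
        # Only segments above the size cap matter here; smaller ones are
        # handled by the normal merge policy without any help.
        if seg["size_bytes"] > max_bytes and deleted_ratio > min_deleted_ratio:
            flagged.append(seg["name"])
    return flagged
```

If this returns an empty list, there is nothing for expungeDeletes to do on the oversized segments yet, so there is no point issuing it.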
So my prescription is:

1> set the max segment size back to 5G

2> monitor your segments. When you see your large segments > 5G have more than 10% deleted documents, issue an expungeDeletes command (not optimize). This will recover your index from the changes you’ve already made.

3> eventually, all your segments will be under 5G. When that happens, stop issuing expungeDeletes.

4> gather some performance statistics and prove one way or another that as deleted docs accumulate over time, it impacts performance.

NOTE: after your last expungeDeletes, deleted docs will accumulate over time until they reach a plateau and shouldn’t continue increasing after that. If you can _prove_ that accumulating deleted documents affects performance, institute a regular expungeDeletes. You could also optimize, but expungeDeletes is less expensive, and on a changing index expungeDeletes is sufficient. Optimize is only really useful for a static index, so I’d avoid it in your situation.

Best,
Erick

> On Oct 26, 2020, at 1:22 AM, Moulay Hicham <maratusa.t...@gmail.com> wrote:
>
> Some large segments were merged into 12GB segments and
> deleted documents were physically removed.
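P.S. For step 2 of the prescription, expungeDeletes is just a commit on the update handler with an extra parameter. A minimal Python sketch of building that request URL follows; the base URL and collection name are placeholders I made up, so substitute your own.

```python
from urllib.parse import quote, urlencode


def expunge_deletes_url(base_url, collection):
    """Build the update URL for a commit with expungeDeletes=true.

    base_url and collection are placeholders, e.g.
    "http://localhost:8983/solr" and "mycoll".
    """
    params = urlencode({"commit": "true", "expungeDeletes": "true"})
    return f"{base_url.rstrip('/')}/{quote(collection)}/update?{params}"


# Usage (not executed here): fetch the URL with any HTTP client, e.g.
#   import urllib.request
#   urllib.request.urlopen(expunge_deletes_url("http://localhost:8983/solr", "mycoll"))
```

Issue this only while you still have oversized segments past the 10% mark; once everything is under 5G, stop, as in step 3.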