Optimizing guarantees that there will be _no_ deleted documents in an index 
when done. If a segment has even one deleted document, it’s merged, no matter 
what you specify for maxSegments. 

Segments are write-once, so removing deleted data from a segment means at 
least rewriting it into a new segment, whether or not it’s merged with another 
segment during the optimize.
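
As a rough illustration of the rule just described, here is a toy Python model. 
It is only a sketch of the stated behavior, not Lucene's merge policy code; the 
Segment class, its field names, and the sample numbers are made up for the 
example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical stand-in for a Lucene segment; not a real Lucene/Solr class.
    name: str
    max_doc: int    # total documents in the segment, including deleted ones
    del_count: int  # documents marked as deleted


def rewritten_on_optimize(segments):
    """Return the segments that hold at least one deleted document.

    Per the description above, these must be rewritten into new segments,
    regardless of the maxSegments value.
    """
    return [s for s in segments if s.del_count > 0]


# Made-up sample index: one clean segment, two with deletes.
segments = [
    Segment("_a", 1000, 0),
    Segment("_b", 1000, 1),
    Segment("_c", 1000, 500),
]

print([s.name for s in rewritten_on_optimize(segments)])  # ['_b', '_c']
```

Even the single deleted document in _b is enough to force a rewrite in this 
model, while _a has nothing to remove.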

expungeDeletes does _not_ merge every segment that has deleted documents. It 
merges only segments in which more than 10% (the default) of the documents are 
deleted. If every segment in your index happens to exceed that threshold, then 
it will, indeed, merge all of them.
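
The threshold check can be sketched in a few lines of Python. This is a 
simplified model under stated assumptions, not the actual merge policy 
implementation: the function name, the tuple layout, and the sample segments 
are hypothetical, and the percentage is assumed to be deleted docs divided by 
the segment's total doc count.

```python
def expunge_candidates(segments, threshold_pct=10.0):
    """Return names of segments whose deleted-doc percentage exceeds the threshold.

    segments: iterable of (name, max_doc, del_count) tuples (made-up layout).
    threshold_pct: assumed default of 10%, as mentioned in the text above.
    """
    selected = []
    for name, max_doc, del_count in segments:
        if max_doc == 0:
            continue  # skip empty segments to avoid dividing by zero
        pct_deleted = 100.0 * del_count / max_doc
        if pct_deleted > threshold_pct:  # strictly greater than the threshold
            selected.append(name)
    return selected


# Made-up sample: 0%, exactly 10%, and 10.1% deleted.
sample = [("_a", 1000, 0), ("_b", 1000, 100), ("_c", 1000, 101)]
print(expunge_candidates(sample))  # ['_c']
```

Only _c is selected here: _b sits exactly at 10%, which does not exceed the 
threshold, and _a has no deletes at all.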

In your example, if you look closely you should find that all segments that had 
any deleted documents were written (merged) to new segments. I’d expect that 
segments with _no_ deleted documents might mostly be left alone. And two of the 
segments were chosen to merge together.

See LUCENE-7976 for a long discussion of how this changed starting with Solr 
7.5.

Best,
Erick

> On Jun 7, 2019, at 7:07 AM, David Santamauro <david.santama...@gmail.com> 
> wrote:
> 
> Erick, on 6.0.1, optimize with maxSegments only merges down to the specified 
> number. E.g., given an index with 75 segments, optimize with maxSegments=74 
> will only merge 2 segments leaving 74 segments. It will choose a segment to 
> merge that has deleted documents, but does not merge every segment with 
> deleted documents.
> 
> I think you are thinking about the expungeDeletes parameter on the commit 
> request. That will merge every segment that has a deleted document.
> 
> 
> On 6/7/19, 10:00 AM, "Erick Erickson" <erickerick...@gmail.com> wrote:
> 
>    This isn’t quite right. Solr will rewrite _all_ segments that have _any_ 
> deleted documents in them when optimizing, even one. Given your description, 
> I’d guess that all your segments will have deleted documents, so even if you 
> do specify maxSegments on the optimize command, the entire index will be 
> rewritten.
> 
>    You’re in a bind, see: 
> https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/.
>  You have this one massive segment and it will _not_ be merged until it’s 
> almost all deleted documents, see the link above for a fuller explanation.
> 
>    Prior to Solr 7.5 you don’t have many options except to re-index and _not_ 
> optimize. So if possible I’d reindex from scratch into a new collection and 
> do not optimize. Or restructure your process such that you can optimize in a 
> quiet period when little indexing is going on.
> 
>    Best,
>    Erick
> 
>> On Jun 7, 2019, at 2:51 AM, jena <sthita2...@gmail.com> wrote:
>> 
>> Thanks @Nicolas Franck for the reply, I don't see any segment info for 4.4
>> version. Is there any API i can use to get my segment information ? Will try
>> to use maxSegments and see if it can help us during optimization.
>> 
>> 
>> 
>> --
>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
> 
> 
