Well, that made a difference! Now we're back at 64 MB per replica.

Thanks,
Markus

-----Original message-----
> From:Erick Erickson <erickerick...@gmail.com>
> Sent: Wednesday 4th October 2017 16:19
> To: solr-user <solr-user@lucene.apache.org>
> Subject: Re: Very high number of deleted docs
>
> Hmmm, OK, I stand corrected.
>
> This is odd, though. I suspect a quirk in the merging algorithm when
> you have a small index.
>
> Ahh, wait. What happens if you modify the segments per tier parameter
> of TMP? The default is 10, and perhaps because this is such a small
> index you don't have very many like-sized segments to merge after your
> periodic run. Setting segs per tier to a much lower number (like 2)
> might kick in the background merging. It'll cause more I/O during
> indexing, of course.
>
> Best,
> Erick
>
> On Wed, Oct 4, 2017 at 7:09 AM, Markus Jelsma
> <markus.jel...@openindex.io> wrote:
> > No, that collection never receives a forceMerge or expungeDeletes. Almost
> > all (99.999%) documents are overwritten every 90 minutes.
> >
> > A single shard has 16k docs (97k total) but is only 300 MB large. Maybe
> > that's a problem there.
> >
> > I can simply turn a switch to forceMerge after the periodic update cycle,
> > but I preferred Lucene to do it for me.
> >
> > Thanks,
> > Markus
> >
> > -----Original message-----
> >> From:Erick Erickson <erickerick...@gmail.com>
> >> Sent: Wednesday 4th October 2017 14:56
> >> To: solr-user <solr-user@lucene.apache.org>
> >> Subject: Re: Very high number of deleted docs
> >>
> >> Did you _ever_ do a forceMerge/optimize or expungeDeletes?
> >>
> >> Here's the problem: TieredMergePolicy (TMP) has a maximum segment size
> >> it will allow, 5G by default. No segment is even considered for
> >> merging unless it has < 2.5G (or half whatever the max is)
> >> non-deleted docs, the logic being that to merge similar size segments,
> >> each has to be less than half the max size.
> >>
> >> However, optimize/forceMerge and expungeDeletes do not have a limit on
> >> the segment size. So say you optimize at some point and have a 100G
> >> segment. It won't get merged until you have 97.5G worth of deleted
> >> docs.
> >>
> >> More here:
> >> https://issues.apache.org/jira/browse/LUCENE-7976
> >>
> >> Erick
> >>
> >> On Wed, Oct 4, 2017 at 5:47 AM, Markus Jelsma
> >> <markus.jel...@openindex.io> wrote:
> >> > Do you mean a periodic forceMerge? That is usually considered a bad
> >> > habit on this list (I agree). It is just that I am actually very
> >> > surprised this can happen at all with default settings. This factory,
> >> > unfortunately, does not seem to support settings configured in solrconfig.
> >> >
> >> > Thanks,
> >> > Markus
> >> >
> >> > -----Original message-----
> >> >> From:Amrit Sarkar <sarkaramr...@gmail.com>
> >> >> Sent: Wednesday 4th October 2017 14:42
> >> >> To: solr-user@lucene.apache.org
> >> >> Subject: Re: Very high number of deleted docs
> >> >>
> >> >> Hi Markus,
> >> >>
> >> >> Emir already mentioned tuning reclaimDeletesWeight, which affects the
> >> >> priority of segments about to merge. Optimise the index from time to
> >> >> time, preferably scheduled weekly / fortnightly / ..., in a low-traffic
> >> >> period, so that you are never in the odd position of having 80% deleted
> >> >> docs in the total index.
> >> >>
> >> >> Amrit Sarkar
> >> >> Search Engineer
> >> >> Lucidworks, Inc.
> >> >> 415-589-9269
> >> >> www.lucidworks.com
> >> >> Twitter http://twitter.com/lucidworks
> >> >> LinkedIn: https://www.linkedin.com/in/sarkaramrit2
> >> >>
> >> >> On Wed, Oct 4, 2017 at 6:02 PM, Emir Arnautović <
> >> >> emir.arnauto...@sematext.com> wrote:
> >> >>
> >> >> > Hi Markus,
> >> >> > You can set reclaimDeletesWeight in merge settings to some higher
> >> >> > value than default (I think it is 2) to favor segments with deleted
> >> >> > docs when merging.
> >> >> >
> >> >> > HTH,
> >> >> > Emir
> >> >> > --
> >> >> > Monitoring - Log Management - Alerting - Anomaly Detection
> >> >> > Solr & Elasticsearch Consulting Support Training -
> >> >> > http://sematext.com/
> >> >> >
> >> >> > > On 4 Oct 2017, at 13:31, Markus Jelsma <markus.jel...@openindex.io>
> >> >> > > wrote:
> >> >> > >
> >> >> > > Hello,
> >> >> > >
> >> >> > > Using Solr 6.6.0, I just spotted one of our collections having a
> >> >> > > core of which over 80% of the total number of documents were
> >> >> > > deleted documents.
> >> >> > >
> >> >> > > It has <mergePolicyFactory
> >> >> > > class="org.apache.solr.index.TieredMergePolicyFactory"/>
> >> >> > > configured with no non-default settings.
> >> >> > >
> >> >> > > Is this supposed to happen? How can I prevent these kinds of numbers?
> >> >> > >
> >> >> > > Thanks,
> >> >> > > Markus
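
For anyone landing on this thread later, here is a minimal sketch of how the two settings mentioned in the thread (segments per tier and reclaimDeletesWeight) might be written in solrconfig.xml on Solr 6.x. The value 2 for segmentsPerTier is Erick's "like 2" suggestion. The value 3.0 for reclaimDeletesWeight is only an illustrative assumption for "some higher value than default". Neither is a tested recommendation, and the thread does not show the exact config Markus ended up with.

```xml
<!-- Sketch only, assuming Solr 6.x: this block lives in solrconfig.xml. -->
<indexConfig>
  <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
    <!-- Erick's suggestion: far fewer segments per tier than the default
         of 10, so a small index still finds like-sized segments to merge. -->
    <int name="segmentsPerTier">2</int>
    <!-- Emir's suggestion: raise above the default (2.0) to favor merging
         segments with many deleted docs. 3.0 is an illustrative value. -->
    <double name="reclaimDeletesWeight">3.0</double>
  </mergePolicyFactory>
</indexConfig>
```

The Lucene javadoc for TieredMergePolicy advises keeping segmentsPerTier greater than or equal to maxMergeAtOnce, so lowering one may call for lowering the other as well. As Erick notes, expect more merge I/O during indexing with a low segmentsPerTier.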