Well, that made a difference! Now we're back at 64 MB per replica.

Thanks,
Markus
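
For reference, lowering segments per tier in solrconfig.xml might look like the fragment below. This is only a sketch: the thread does not show the configuration that was actually applied, the value 2 is simply the one suggested in the quoted reply, and the placement inside <indexConfig> is assumed to follow the usual Solr layout.

```xml
<indexConfig>
  <!-- Assumed example, not the poster's actual configuration. -->
  <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
    <!-- Default is 10. A lower value lets TMP merge a small index more
         eagerly, at the cost of extra I/O during indexing. -->
    <int name="segmentsPerTier">2</int>
  </mergePolicyFactory>
</indexConfig>
```

Other TMP settings such as maxMergeAtOnce are left at their defaults here; whether they should be lowered as well is not discussed in the thread.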
 
 
-----Original message-----
> From:Erick Erickson <erickerick...@gmail.com>
> Sent: Wednesday 4th October 2017 16:19
> To: solr-user <solr-user@lucene.apache.org>
> Subject: Re: Very high number of deleted docs
> 
> Hmmm, OK, I stand corrected.
> 
> This is odd, though. I suspect a quirk in the merging algorithm when
> you have a small index.
> 
> Ahh, wait. What happens if you modify the segments per tier parameter
> of TMP? The default is 10, and perhaps because this is such a small
> index you don't have very many like-sized segments to merge after your
> periodic run. Setting segs per tier to a much lower number (like 2)
> might kick in the background merging. It'll cause more I/O during
> indexing, of course.
> 
> Best,
> Erick
> 
> On Wed, Oct 4, 2017 at 7:09 AM, Markus Jelsma
> <markus.jel...@openindex.io> wrote:
> > No, that collection never receives a forceMerge nor expungeDeletes. Almost 
> > all (99.999%) documents are overwritten every 90 minutes.
> >
> > A single shard has 16k docs (97k total) but is only 300 MB large. Maybe 
> > that's a problem there.
> >
> > I can simply flip a switch to forceMerge after the periodic update cycle,
> > but I preferred Lucene to do it for me.
> >
> > Thanks,
> > Markus
> >
> > -----Original message-----
> >> From:Erick Erickson <erickerick...@gmail.com>
> >> Sent: Wednesday 4th October 2017 14:56
> >> To: solr-user <solr-user@lucene.apache.org>
> >> Subject: Re: Very high number of deleted docs
> >>
> >> Did you _ever_ do a forceMerge/optimize or expungeDeletes?
> >>
> >> Here's the problem: TieredMergePolicy (TMP) has a maximum segment size
> >> it will allow, 5G by default. No segment is even considered for
> >> merging unless it has less than 2.5G (or half of whatever the maximum
> >> is set to) of non-deleted docs, the logic being that to merge
> >> similar-size segments, each has to be less than half the max size.
> >>
> >> However, optimize/forceMerge and expungeDeletes do not have a limit on
> >> the segment size. So say you optimize at some point and have a 100G
> >> segment. It won't get merged until you have 97.5G worth of deleted
> >> docs.
> >>
> >> More here:
> >> https://issues.apache.org/jira/browse/LUCENE-7976
> >>
> >> Erick
> >>
> >> On Wed, Oct 4, 2017 at 5:47 AM, Markus Jelsma
> >> <markus.jel...@openindex.io> wrote:
> >> > Do you mean a periodic forceMerge? That is usually considered a bad
> >> > habit on this list (I agree). It is just that I am actually very
> >> > surprised this can happen at all with default settings. This factory,
> >> > unfortunately, does not seem to support settings configured in solrconfig.
> >> >
> >> > Thanks,
> >> > Markus
> >> >
> >> > -----Original message-----
> >> >> From:Amrit Sarkar <sarkaramr...@gmail.com>
> >> >> Sent: Wednesday 4th October 2017 14:42
> >> >> To: solr-user@lucene.apache.org
> >> >> Subject: Re: Very high number of deleted docs
> >> >>
> >> >> Hi Markus,
> >> >>
> >> >> Emir already mentioned tuning reclaimDeletesWeight, which affects the
> >> >> priority of segments that are candidates for merging. You could also
> >> >> optimise the index from time to time, preferably scheduled weekly or
> >> >> fortnightly during a low-traffic period, so that you never end up in
> >> >> the odd position of 80% deleted docs in the total index.
> >> >>
> >> >> Amrit Sarkar
> >> >> Search Engineer
> >> >> Lucidworks, Inc.
> >> >> 415-589-9269
> >> >> www.lucidworks.com
> >> >> Twitter http://twitter.com/lucidworks
> >> >> LinkedIn: https://www.linkedin.com/in/sarkaramrit2
> >> >>
> >> >> On Wed, Oct 4, 2017 at 6:02 PM, Emir Arnautović <
> >> >> emir.arnauto...@sematext.com> wrote:
> >> >>
> >> >> > Hi Markus,
> >> >> > You can set reclaimDeletesWeight in merge settings to some higher value
> >> >> > than the default (I think it is 2) to favor segments with deleted docs
> >> >> > when merging.
> >> >> >
> >> >> > HTH,
> >> >> > Emir
> >> >> > --
> >> >> > Monitoring - Log Management - Alerting - Anomaly Detection
> >> >> > Solr & Elasticsearch Consulting Support Training - 
> >> >> > http://sematext.com/
> >> >> >
> >> >> >
> >> >> >
> >> >> > > On 4 Oct 2017, at 13:31, Markus Jelsma <markus.jel...@openindex.io> wrote:
> >> >> > >
> >> >> > > Hello,
> >> >> > >
> >> >> > > Using 6.6.0, I just spotted that one of our collections has a core in
> >> >> > > which over 80% of the total number of documents are deleted documents.
> >> >> > >
> >> >> > > It has <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory"/>
> >> >> > > configured with no non-default settings.
> >> >> > >
> >> >> > > Is this supposed to happen? How can I prevent these kinds of numbers?
> >> >> > >
> >> >> > > Thanks,
> >> >> > > Markus
> >> >> >
> >> >> >
> >> >>
> >>
> 
