RE: Very high number of deleted docs, part 2

2018-01-11 Thread Markus Jelsma
ick Erickson <erickerick...@gmail.com> > Sent: Wednesday 10th January 2018 22:41 > To: solr-user <solr-user@lucene.apache.org> > Subject: Re: Very high number of deleted docs, part 2 > > There's some background here: > https://lucidworks.com/2017/10/13/segment-merging-d

Re: Very high number of deleted docs, part 2

2018-01-10 Thread Erick Erickson
<erickerick...@gmail.com> > > Sent: Friday 5th January 2018 17:56 > > To: solr-user <solr-user@lucene.apache.org> > > Subject: Re: Very high number of deleted docs, part 2 > > > > I'm not 100% sure that playing with maxSegments will work. > > > >

RE: Very high number of deleted docs, part 2

2018-01-10 Thread Markus Jelsma
y 2018 17:56 > To: solr-user <solr-user@lucene.apache.org> > Subject: Re: Very high number of deleted docs, part 2 > > I'm not 100% sure that playing with maxSegments will work. > > what will work is to re-index everything. You can re-index into the > existing collection,

Re: Very high number of deleted docs, part 2

2018-01-05 Thread Erick Erickson
w about optimizing it again, with maxSegments set to ten, it should > recover right? > > -Original message- > > From:Shawn Heisey <apa...@elyograg.org> > > Sent: Friday 5th January 2018 14:34 > > To: solr-user@lucene.apache.org > > Subject: Re: Very hig

RE: Very high number of deleted docs, part 2

2018-01-05 Thread Markus Jelsma
: Friday 5th January 2018 14:34 > To: solr-user@lucene.apache.org > Subject: Re: Very high number of deleted docs, part 2 > > On 1/5/2018 5:33 AM, Markus Jelsma wrote: > > Another collection, now on 7.1, also shows this problem and has default TMP > > settings. Thi

Re: Very high number of deleted docs, part 2

2018-01-05 Thread Shawn Heisey
On 1/5/2018 5:33 AM, Markus Jelsma wrote: Another collection, now on 7.1, also shows this problem and has default TMP settings. This time the size is different: each shard of this collection is over 40 GB, and each shard has about 50% deleted documents. Each shard's largest segment is just under

RE: Very high number of deleted docs

2017-10-04 Thread Markus Jelsma
Well, that made a difference! Now we're back at 64 MB per replica. Thanks, Markus -Original message- > From:Erick Erickson <erickerick...@gmail.com> > Sent: Wednesday 4th October 2017 16:19 > To: solr-user <solr-user@lucene.apache.org> > Subject: Re: Very hig

Re: Very high number of deleted docs

2017-10-04 Thread Erick Erickson
rgeMerge after the periodic update cycle, but > I preferred Lucene to do it for me. > > Thanks, > Markus > > -Original message- >> From:Erick Erickson <erickerick...@gmail.com> >> Sent: Wednesday 4th October 2017 14:56 >> To: solr-user <solr-

RE: Very high number of deleted docs

2017-10-04 Thread Markus Jelsma
update cycle, but I preferred Lucene to do it for me. Thanks, Markus -Original message- > From:Erick Erickson <erickerick...@gmail.com> > Sent: Wednesday 4th October 2017 14:56 > To: solr-user <solr-user@lucene.apache.org> > Subject: Re: Very high number of deleted doc

RE: Very high number of deleted docs

2017-10-04 Thread Markus Jelsma
Ah thanks for that! -Original message- > From:Emir Arnautović <emir.arnauto...@sematext.com> > Sent: Wednesday 4th October 2017 15:03 > To: solr-user@lucene.apache.org > Subject: Re: Very high number of deleted docs > > Hi Markus, > It is passed but not expl

Re: Very high number of deleted docs

2017-10-04 Thread Erick Erickson
s >> >> -Original message- >>> From:Amrit Sarkar <sarkaramr...@gmail.com> >>> Sent: Wednesday 4th October 2017 14:42 >>> To: solr-user@lucene.apache.org >>> Subject: Re: Very high number of deleted docs >>> >>> Hi Markus, >>

Re: Very high number of deleted docs

2017-10-04 Thread Emir Arnautović
onfig. > > Thanks, > Markus > > -Original message- >> From:Amrit Sarkar <sarkaramr...@gmail.com> >> Sent: Wednesday 4th October 2017 14:42 >> To: solr-user@lucene.apache.org >> Subject: Re: Very high number of deleted docs >> >> Hi Markus, >>

Re: Very high number of deleted docs

2017-10-04 Thread Erick Erickson
ent: Wednesday 4th October 2017 14:42 >> To: solr-user@lucene.apache.org >> Subject: Re: Very high number of deleted docs >> >> Hi Markus, >> >> Emir already mentioned tuning reclaimDeletesWeight which affects segments >> about to merge priority. Optimising index

RE: Very high number of deleted docs

2017-10-04 Thread Markus Jelsma
-Original message- > From:Amrit Sarkar <sarkaramr...@gmail.com> > Sent: Wednesday 4th October 2017 14:42 > To: solr-user@lucene.apache.org > Subject: Re: Very high number of deleted docs > > Hi Markus, > > Emir already mentioned tuning reclaimDeletesWeight w

RE: Very high number of deleted docs

2017-10-04 Thread Markus Jelsma
> To: solr-user@lucene.apache.org > Subject: Re: Very high number of deleted docs > > Hi Markus, > You can set reclaimDeletesWeight in merge settings to some higher value than > default (I think it is 2) to favor segments with deleted docs when merging. > > HTH, > Emir > -

Re: Very high number of deleted docs

2017-10-04 Thread Amrit Sarkar
Hi Markus, Emir already mentioned tuning reclaimDeletesWeight, which affects the priority of segments that are candidates for merging. Optimise the index from time to time, preferably on a schedule (weekly, fortnightly, ...) during a low-traffic period, so that you never end up in such an odd position as 80% deleted docs in the total index. Amrit Sarkar
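
The scheduled-optimize suggestion above could be sketched roughly as follows. This is only an illustration, not something posted in the thread: the function names, the base URL, the collection name and the 50% threshold are all hypothetical. It assumes you already have numDocs and maxDoc for the index (Solr reports both), and it only builds the update URL with optimize=true and maxSegments rather than sending it. Note that elsewhere in this thread there is doubt about whether playing with maxSegments helps in every case.

```python
from urllib.parse import urlencode


def deleted_pct(max_doc: int, num_docs: int) -> float:
    """Percentage of documents in the index that are marked as deleted.

    max_doc counts live plus deleted documents, num_docs counts live ones.
    """
    if max_doc <= 0:
        return 0.0
    return 100.0 * (max_doc - num_docs) / max_doc


def should_optimize(max_doc: int, num_docs: int, threshold_pct: float = 50.0) -> bool:
    """True when the deleted share reaches the (hypothetical) threshold."""
    return deleted_pct(max_doc, num_docs) >= threshold_pct


def optimize_url(base_url: str, collection: str, max_segments: int = 1) -> str:
    """Build (but do not send) an update request that asks for an optimize."""
    params = urlencode({"optimize": "true", "maxSegments": max_segments})
    return f"{base_url.rstrip('/')}/{collection}/update?{params}"


if __name__ == "__main__":
    # Hypothetical numbers: 100 docs in the index, 20 of them still live.
    if should_optimize(max_doc=100, num_docs=20):
        print(optimize_url("http://localhost:8983/solr", "mycollection", 10))
```

A cron job or similar scheduler could run such a check during a low-traffic window and issue the request only when the threshold is crossed.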

Re: Very high number of deleted docs

2017-10-04 Thread Emir Arnautović
Hi Markus, You can set reclaimDeletesWeight in the merge settings to a value higher than the default (I think it is 2) to favor segments with deleted docs when merging. HTH, Emir
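
To see why a higher reclaimDeletesWeight favors segments with deleted docs, here is a small sketch. It rests on my recollection of how Lucene's TieredMergePolicy of that era scored candidate merges, so treat it as an assumption rather than the library's exact code: the score (lower is better) is multiplied by the ratio of bytes after the merge to bytes before the merge, raised to the power reclaimDeletesWeight. The function name and the numbers are made up for illustration, and the other terms of the score are left out.

```python
def deletes_factor(bytes_before: float, bytes_after: float,
                   reclaim_deletes_weight: float = 2.0) -> float:
    """Multiplier applied to a candidate merge's score (lower is better).

    bytes_after / bytes_before is the share of the merged data that is
    still live; raising it to a larger weight shrinks the factor faster
    for merges that drop many deleted documents.
    """
    non_del_ratio = bytes_after / bytes_before
    return non_del_ratio ** reclaim_deletes_weight


if __name__ == "__main__":
    # Hypothetical merge that would reclaim half of its bytes.
    for weight in (2.0, 3.0):
        print(weight, deletes_factor(100.0, 50.0, weight))
```

With the same candidate merge, raising the weight from 2 to 3 lowers the factor, so merges that reclaim many deletes win more often against merges that reclaim few.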