OK Thanks Shawn,

I went with the settings below because the default of 10 wasn't working for us,
and it looks like my index is staying under 20 GB now, with numDocs : 16897524
and maxDoc : 19048053.

        <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
          <int name="maxMergeAtOnce">5</int>
          <int name="segmentsPerTier">5</int>
          <int name="maxMergeAtOnceExplicit">15</int>
          <double name="maxMergedSegmentMB">6144.0</double>
          <double name="reclaimDeletesWeight">6.0</double>
        </mergePolicy>



-----Original Message-----
From: Shawn Heisey [mailto:s...@elyograg.org] 
Sent: Wednesday, July 10, 2013 5:34 PM
To: solr-user@lucene.apache.org
Subject: Re: expunging deletes

On 7/10/2013 5:58 PM, Petersen, Robert wrote:
> Using solr 3.6.1 and the following settings, I am trying to run without 
> optimizes.  I used to optimize nightly, but sometimes the optimize took a 
> very long time to complete and slowed down our indexing.  We are continuously 
> indexing our new or changed data all day and night.  After a few days running 
> without an optimize, the index size has nearly doubled and maxdocs is nearly 
> twice the size of numdocs.  I understand deletes should be expunged on 
> merges, but even after trying lots of different settings for our merge policy 
> it seems this growth is somewhat unbounded.  I have tried sending an optimize 
> with numSegments = 2 which is a lot lighter weight than a regular optimize 
> and that does bring the number down but not by too much.  Does anyone have 
> any ideas for better settings for my merge policy that would help?  Here is 
> my current index snapshot too:

Your merge settings are the equivalent of the old mergeFactor set to 35, and 
based on the fact that you have the Explicit set to 105, I'm guessing your 
settings originally came from something I posted - these are the numbers that I 
use.  These settings can result in a very large number of segments on your disk.

Because you index a lot (and probably reindex existing documents often), I can 
understand why you have high merge settings, but if you want to eliminate 
optimizes, you'll need to go lower.  The default merge setting of 10 (with an 
Explicit value of 30) is probably a good starting point, but you might need to 
go even smaller.
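
For reference, a merge policy at those default values would look roughly like 
the sketch below, written in the same solrconfig.xml form as the settings 
elsewhere in this thread.  The 10/10/30 figures are the TieredMergePolicy 
defaults as I understand them; treat this as a starting point to tune from, 
not a tested recommendation for this particular index.

        <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
          <!-- assumed defaults: merge up to 10 segments at once -->
          <int name="maxMergeAtOnce">10</int>
          <!-- assumed default: allow 10 segments per tier -->
          <int name="segmentsPerTier">10</int>
          <!-- the "Explicit" value: limit used for optimize/expunge merges -->
          <int name="maxMergeAtOnceExplicit">30</int>
        </mergePolicy>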

On Solr 3.6, an optimize probably cannot take place at the same time as index 
updates -- the optimize would probably delay updates until after it's finished. 
I remember running into problems on Solr 3.x, so I set up my indexing program 
to stop updates while the index was optimizing.

Solr 4.x should lift the restriction that keeps optimizes and updates from 
happening at the same time.

With an index size of 25GB, a six-drive RAID10 should be able to optimize in 
10-15 minutes, but if your I/O system is a single disk, RAID1, RAID5, or RAID6, 
the write performance may cause this to take longer.  If you went with SSDs, 
optimizes would happen VERY fast.

Thanks,
Shawn


