Hi Upayavira and Erick, There are two things we are talking about here.

First: Why am I optimizing? If I don't, our SEARCH (NOT INDEXING) performance is 100% worse. The problem lies in the total number of segments. We have to have max segments 1 or 2. I have done intensive performance-related tests around the number of segments, the merge factor, and changing the merge policy.

Second: Solr does not perform better for me without an optimize. So now that I have to optimize, the second issue is updating concurrently during an optimize. If I update while an optimize is happening, the optimize takes 5 times as long as a normal optimize.

So is there any way other than creating a postOptimize hook, writing the status to a file, and somehow making it available to the indexer? All of this just sounds traumatic :)

Thanks,
Summer

> On Jun 29, 2015, at 5:40 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>
> Steven:
>
> Yes, but....
>
> First, here's Mike McCandless's excellent blog on segment merging:
> http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
>
> I think the third animation is the TieredMergePolicy. In short, yes an
> optimize will reclaim disk space. But as you update, this is done for
> you anyway. About the only time optimizing is at all beneficial is
> when you have a relatively static index. If you're continually
> updating documents, and by that I mean replacing some existing
> documents, then you'll immediately start generating "holes" in your
> index.
>
> And if you _do_ optimize, you wind up with a huge segment. And since
> the default policy tries to merge segments of roughly the same size,
> it accumulates deletes for quite a while before they are merged away.
>
> And if you don't update existing docs or delete docs, then there's no
> wasted space anyway.
>
> Summer:
>
> First off, why do you care about not updating during optimizing?
> There's no good reason you have to worry about that, you can freely
> update while optimizing.
>
> But frankly I have to agree with Upayavira that on the face of it
> you're doing a lot of extra work. See above, but you optimize while
> indexing, so immediately you're rather defeating the purpose.
> Personally I'd only optimize relatively static indexes and, by
> definition, your index isn't static since the second process is just
> waiting to modify it.
>
> Best,
> Erick
>
> On Mon, Jun 29, 2015 at 8:15 AM, Steven White <swhite4...@gmail.com> wrote:
>> Hi Upayavira,
>>
>> This is news to me that we should not optimize and index.
>>
>> What about disk space saving, isn't optimization to reclaim disk space or
>> does Solr somehow do that? Where can I read more about this?
>>
>> I'm on Solr 5.1.0 (may switch to 5.2.1)
>>
>> Thanks
>>
>> Steve
>>
>> On Mon, Jun 29, 2015 at 4:16 AM, Upayavira <u...@odoko.co.uk> wrote:
>>
>>> I'm afraid I don't understand. You're saying that optimising is causing
>>> performance issues?
>>>
>>> Simple solution: DO NOT OPTIMIZE!
>>>
>>> Optimisation is very badly named. What it does is squash all segments
>>> in your index into one segment, removing all deleted documents. It is
>>> good to get rid of deletes - in that sense the index is "optimized".
>>> However, future merges become very expensive. The best way to handle
>>> this topic is to leave it to Lucene/Solr to do it for you. Pretend the
>>> "optimize" option never existed.
>>>
>>> This is, of course, assuming you are using something like Solr 3.5+.
>>>
>>> Upayavira
>>>
>>> On Mon, Jun 29, 2015, at 08:08 AM, Summer Shire wrote:
>>>>
>>>> Have to, because of performance issues.
>>>> Just want to know if there is a way to tap into the status.
>>>>
>>>>> On Jun 28, 2015, at 11:37 PM, Upayavira <u...@odoko.co.uk> wrote:
>>>>>
>>>>> Bigger question, why are you optimizing? Since 3.6 or so, it generally
>>>>> hasn't been required; it is even a bad thing.
>>>>>
>>>>> Upayavira
>>>>>
>>>>>> On Sun, Jun 28, 2015, at 09:37 PM, Summer Shire wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> I have two indexers (independent processes) writing to a common Solr
>>>>>> core.
>>>>>> If one indexer process issues an optimize on the core,
>>>>>> I want the second indexer to wait before adding docs until the optimize
>>>>>> has finished.
>>>>>>
>>>>>> Are there ways I can do this programmatically?
>>>>>> Pinging the core while the optimize is happening returns OK because
>>>>>> technically Solr allows you to update while an optimize is happening.
>>>>>>
>>>>>> Any suggestions?
>>>>>>
>>>>>> Thanks,
>>>>>> Summer
>>>
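
For anyone who lands on this thread with the same question, the "write the status in a file" idea from the top message can be sketched without any Solr-specific hook. This is only a rough illustration, not something proposed by anyone in the thread. It assumes the indexer that triggers the optimize makes a blocking call that returns only when the optimize has finished, and that both indexers can see the same filesystem. The marker path and the names `optimize_marker`, `wait_until_idle`, `run_optimize` and `add_documents` are all hypothetical.

```python
import os
import tempfile
import time
from contextlib import contextmanager
from pathlib import Path

# Hypothetical marker location shared by both indexer processes.
# This is an illustration only, not a Solr convention.
MARKER = Path(tempfile.gettempdir()) / "solr_core_optimize.in_progress"


@contextmanager
def optimize_marker(marker=MARKER):
    """Create the marker file for the duration of an optimize.

    The marker is removed even if the wrapped call raises.
    """
    marker.write_text(str(os.getpid()))
    try:
        yield marker
    finally:
        marker.unlink(missing_ok=True)  # requires Python 3.8+


def wait_until_idle(marker=MARKER, poll_seconds=1.0, timeout_seconds=3600.0):
    """Poll until the marker file disappears.

    Returns True once no optimize is flagged, or False if the timeout
    expires while the marker is still present.
    """
    deadline = time.monotonic() + timeout_seconds
    while marker.exists():
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
    return True


# Indexer A (the one that optimizes), with a hypothetical blocking call:
#     with optimize_marker():
#         run_optimize()
#
# Indexer B (the one that should wait), with a hypothetical update call:
#     if wait_until_idle():
#         add_documents(batch)
```

This is advisory only: an update already in flight when the marker appears will still overlap with the optimize, and a marker left behind by a crashed process has to be cleaned up by hand (the PID written into the file is there to help with that).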