On 5/26/2015 6:29 AM, Upayavira wrote:
> Are you saying that the reason you are optimising is because you have
> been doing it for years? If this is the only reason, you should stop
> doing it immediately. 
>
> The one scenario in which optimisation still makes some sense is when
> you reindex every night and optimise straight after. This will leave you
> with a single segment which will search faster.
>
> However, if you are doing a lot of indexing, especially with
> deletes/updates, you will have merged your content into a single segment
> which will later need to be merged. That merge will be costly as it will
> involve copying the entire content of your large segment, which will
> impact performance.
>
> Before Solr 3.6, Optimisation was necessary and recommended. At that
> point (or a little before) the TieredMergePolicy became the default, and
> this made optimisation generally unnecessary.

In general, I concur with this advice about optimizing.  Historically,
optimize was done for increased performance.  In older versions, an
unoptimized index performed *MUCH* worse than an index with a single
segment.  This is no longer the case today, mostly due to so many Lucene
features working on a per-segment basis.  A single segment does perform
faster, but the difference is much smaller than it used to be.

A full optimize on a large index requires a LOT of CPU and I/O resources,
and performance suffers for as long as the optimize is underway.

There are, however, still times when running optimize is appropriate:

1) The index is mostly static, not receiving very frequent updates.
2) There is a large percentage of deleted documents in the index.

With modern Lucene/Solr and these use cases, the reasons for optimizing
are still performance-related, but the only time you should do an
optimize is when the benefit outweighs the cost.

For use case 1), the index will likely remain mostly optimized for a
long period of time after the optimize is done, so the resources
required for the optimize are worth spending.

For use case 2), optimizing will reduce the size of the index
significantly, so general performance gets better.  That makes the cost
worthwhile.
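
As a rough illustration of weighing those two cases, here is a minimal
Python sketch.  It assumes you already have the numDocs and maxDoc values
for the index (maxDoc counts deleted documents, numDocs does not).  The
function names and the 30 percent threshold are hypothetical, chosen only
for the example; the right cutoff depends on your index and hardware.

```python
def deleted_fraction(num_docs: int, max_doc: int) -> float:
    """Fraction of the index occupied by deleted documents.

    max_doc includes deleted documents; num_docs counts only live ones.
    """
    if max_doc <= 0:
        return 0.0
    return (max_doc - num_docs) / max_doc


def optimize_is_worthwhile(num_docs: int, max_doc: int, mostly_static: bool,
                           deleted_threshold: float = 0.3) -> bool:
    """Rough check for the two use cases described above.

    deleted_threshold is an assumed cutoff, not a recommended value.
    """
    # Use case 1: the index rarely changes, so it stays optimized for a long time.
    if mostly_static:
        return True
    # Use case 2: enough deleted documents that the index would shrink noticeably.
    return deleted_fraction(num_docs, max_doc) >= deleted_threshold
```

With these assumptions, an index that sees frequent updates and has
750 live documents out of a maxDoc of 1000 (25 percent deleted) would
not qualify, while one with 600 live documents out of 1000 would.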

Thanks,
Shawn
