On Sep 21, 2009, at 9:35 PM, John Wang wrote:
Jason:
You are missing the point.
The idea is to avoid merging large segments. The point of
this MergePolicy is to balance segment merges across the index. The
aim is not to have 1 large segment; it is to have n segments with
balanced sizes.
When the large segment is out of the IO cache, replacing it is
very costly. What we have done is to split the cost over time by
having more frequent but faster merges.
Yeah, I have seen this in action several times as well. See also some
discussion at: http://www.lucidimagination.com/search/document/bd53b0431f7eada5/concurrentmergescheduler_and_mergepolicy_question
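
For anyone skimming the thread, here is a rough sketch of the selection idea John describes: leave the big segments alone and keep merging small ones so that no single merge is expensive. This is not John's MergePolicy and it does not use the Lucene MergePolicy API. The class name, the pickMerge method, and the mergeFactor and maxMergedSize parameters are all made up for illustration, sizes are in arbitrary units, and it ignores the question of whether the merged segments must be adjacent.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not Lucene code: choose which segments to merge
// while never touching (or creating) a segment above a size cap.
public class BalancedMergeSketch {

    // Returns the indices of the segments to merge next, smallest first.
    // Returns an empty list when no merge of at least two segments fits
    // under maxMergedSize.
    static List<Integer> pickMerge(long[] sizes, int mergeFactor, long maxMergedSize) {
        // Order segment indices by ascending size.
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < sizes.length; i++) {
            order.add(i);
        }
        order.sort(Comparator.comparingLong(i -> sizes[i]));

        List<Integer> picked = new ArrayList<>();
        long total = 0;
        for (int idx : order) {
            if (picked.size() == mergeFactor) {
                break; // enough segments for one merge
            }
            if (total + sizes[idx] > maxMergedSize) {
                break; // every remaining segment is at least this large
            }
            picked.add(idx);
            total += sizes[idx];
        }

        // Merging fewer than two segments is not a merge.
        if (picked.size() < 2) {
            picked.clear();
        }
        return picked;
    }

    public static void main(String[] args) {
        long[] sizes = {900, 40, 10, 25, 300};
        // The 900 and 300 segments are left alone; the three small ones merge.
        System.out.println(pickMerge(sizes, 3, 400)); // prints [2, 3, 1]
    }
}
```

The cap is what spreads the cost over time: each merge rewrites at most maxMergedSize worth of data, so you get more merges but each one is cheap, and the large segment that may have dropped out of the IO cache is never rewritten.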