[ 
https://issues.apache.org/jira/browse/LUCENE-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932124#action_12932124
 ] 

Earwin Burrfoot commented on LUCENE-2755:
-----------------------------------------

Whatever solution for block-on-add you employ, I think it is important to 
implement it as an Executor. I think people can benefit from the threading 
policy being pluggable.
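
To make the Executor idea concrete, here is a minimal sketch of block-on-add 
written as a wrapper around any java.util.concurrent.Executor. The class name 
BoundedExecutor and the maxInFlight parameter are my own placeholders, not 
anything in Lucene; the only assumption is that "block-on-add" means the 
submitting thread waits once a fixed number of tasks are in flight.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;

/** Hypothetical wrapper: caps in-flight tasks and blocks the submitter at the cap. */
final class BoundedExecutor implements Executor {
    private final Executor delegate;   // the pluggable threading policy
    private final Semaphore permits;   // one permit per allowed in-flight task

    BoundedExecutor(Executor delegate, int maxInFlight) {
        this.delegate = delegate;
        this.permits = new Semaphore(maxInFlight);
    }

    @Override
    public void execute(Runnable task) {
        try {
            permits.acquire();         // blocks the caller when the cap is reached
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RejectedExecutionException("interrupted while waiting for a slot", e);
        }
        try {
            delegate.execute(() -> {
                try {
                    task.run();
                } finally {
                    permits.release(); // free the slot even if the task throws
                }
            });
        } catch (RejectedExecutionException e) {
            permits.release();         // the task never started, so give the slot back
            throw e;
        }
    }

    /** Number of tasks that could be submitted right now without blocking. */
    int availableSlots() {
        return permits.availablePermits();
    }
}
```

Swapping the delegate (a thread pool, or Runnable::run for same-thread 
execution) changes the threading policy without touching the blocking logic.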

I'm not against sorting merges; it's so simple, even if useless. Though maybe 
it's better to use a Comparator, so you can redefine the order? Pausing large 
merges is another matter: that's a freakload of complexity for zero gain.
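
A rough sketch of the Comparator variant, assuming pending merges sit in a 
priority queue. PendingMerge and MergeQueue are hypothetical stand-ins (the 
real MP.OM has a different shape), and the two orderings shown are only 
examples of what a user might plug in.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

/** Hypothetical stand-in for a pending merge; not Lucene's OneMerge. */
final class PendingMerge {
    final String name;
    final long totalBytes;
    final int totalDocs;

    PendingMerge(String name, long totalBytes, int totalDocs) {
        this.name = name;
        this.totalBytes = totalBytes;
        this.totalDocs = totalDocs;
    }
}

/** Hypothetical queue whose ordering is supplied by the caller. */
final class MergeQueue {
    static final Comparator<PendingMerge> SMALLEST_BYTES_FIRST =
        Comparator.comparingLong((PendingMerge m) -> m.totalBytes);
    static final Comparator<PendingMerge> SMALLEST_DOCS_FIRST =
        Comparator.comparingInt((PendingMerge m) -> m.totalDocs);

    private final PriorityBlockingQueue<PendingMerge> queue;

    MergeQueue(Comparator<PendingMerge> order) {
        // 11 is the JDK's default initial capacity; the queue grows as needed.
        this.queue = new PriorityBlockingQueue<>(11, order);
    }

    void add(PendingMerge merge) {
        queue.add(merge);
    }

    /** Returns the merge that sorts first under the configured order, or null if empty. */
    PendingMerge next() {
        return queue.poll();
    }
}
```

Redefining the order is then a constructor argument rather than a subclass.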

Another issue to ponder: what about slightly uncluttering the IW <-> MS 
interaction?
We drop IW.getNextMerge, MS.merge(IW), and replace them with 
MS.scheduleMerge(MP.OM), so instead of IW asking MS to pull all merges from 
itself, it simply pushes them.
Also, let's kill this weeeird IW.mergeInit that is called from CMS, but not SMS 
:)
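
To illustrate the push direction, a small sketch follows. Only the method name 
scheduleMerge comes from the proposal above; MergeSpec, PushScheduler, 
WriterSketch and RecordingScheduler are hypothetical names, and the real 
IW/MS classes carry far more state than this.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for MP.OM; carries only a label here. */
final class MergeSpec {
    final String label;

    MergeSpec(String label) {
        this.label = label;
    }
}

/** Hypothetical push-style scheduler: it is handed merges, it never asks for them. */
interface PushScheduler {
    void scheduleMerge(MergeSpec merge);
}

/** Hypothetical writer side: no getNextMerge(), no writer reference given to the scheduler. */
final class WriterSketch {
    private final PushScheduler scheduler;

    WriterSketch(PushScheduler scheduler) {
        this.scheduler = scheduler;
    }

    /** Called whenever the merge policy has selected merges; each one is pushed. */
    void onMergesFound(List<MergeSpec> found) {
        for (MergeSpec merge : found) {
            scheduler.scheduleMerge(merge);
        }
    }
}

/** Trivial scheduler that records what it was given, in arrival order. */
final class RecordingScheduler implements PushScheduler {
    final List<String> seen = new ArrayList<>();

    @Override
    public void scheduleMerge(MergeSpec merge) {
        seen.add(merge.label);
    }
}
```

The scheduler no longer needs to know the writer at all, which is the 
uncluttering I have in mind.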


But oh, well. With the introduction of executors, and SMS folded in as a 
special case of CMS, we might as well drop MS completely and move what little 
code is left straight to IW, which will now accept an executor.
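
A sketch of what "IW accepts an executor" could look like, under the 
assumption that a merge can be represented as a Runnable. 
ExecutorWriterSketch, submitMerge and SERIAL are hypothetical names; the point 
is only that the serial case is a same-thread Executor.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical writer that takes its merge threading policy as an Executor. */
final class ExecutorWriterSketch {
    /** SMS-like behaviour: run each merge on the calling thread. */
    static final Executor SERIAL = Runnable::run;

    private final Executor mergeExecutor;
    private final AtomicInteger completed = new AtomicInteger();

    ExecutorWriterSketch(Executor mergeExecutor) {
        this.mergeExecutor = mergeExecutor;
    }

    /** Hands the merge work to whatever executor was configured. */
    void submitMerge(Runnable mergeWork) {
        mergeExecutor.execute(() -> {
            mergeWork.run();
            completed.incrementAndGet(); // counted only if the work did not throw
        });
    }

    int completedMerges() {
        return completed.get();
    }
}
```

Passing Executors.newFixedThreadPool(n) instead of SERIAL would give the 
CMS-like behaviour, with no separate scheduler class involved.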

> Some improvements to CMS
> ------------------------
>
>                 Key: LUCENE-2755
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2755
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>
> While running optimize on a large index, I've noticed several things that got 
> me to read CMS code more carefully, and find these issues:
> * CMS may hold onto a merge if maxMergeCount is hit. That results in the 
> MergeThreads taking merges from the IndexWriter until they are exhausted, and 
> only then will that blocked merge run. I don't think that merge needs to be 
> blocked.
> * CMS sorts merges by segments size, doc-based and not bytes-based. Since the 
> default MP is LogByteSizeMP, and I hardly believe people care about doc-based 
> size segments anymore, I think we should switch the default impl. There are 
> two ways to make it extensible, if we want:
> ** Have an overridable member/method in CMS that you can extend and override 
> - easy.
> ** Have OneMerge be comparable and let the MP determine the order (e.g. by 
> bytes, docs, calibrate deletes etc.). Better, but will need to tap into 
> several places in the code, so more risky and complicated.
> On the go, I'd like to add some documentation to CMS - it's not very easy to 
> read and follow.
> I'll work on a patch.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.