[ 
https://issues.apache.org/jira/browse/LUCENE-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932486#action_12932486
 ] 

Shai Erera commented on LUCENE-2755:
------------------------------------

bq. Ideally only IW.merge should call it (and it becomes private),

I wouldn't make it private. If I remember correctly, the Parallel Index 
overrode that method to synchronize merges across all parallels.

bq. but if you're at your max merge count then CMS will stall you and the 
turnaround time easily becomes seconds, which is awful.

But Mike, if you hit your maxMergeCount with large merges, then you won't run 
tiny merges at all. This 'pausing' only helps if there is still room to run 
more merges. I trust you when you say you've observed that not pausing those 
merges hurts performance, but I wonder how often that happens in real life, and 
whether we should build it into our code. If it's a rare case, then perhaps 
apps that hit it should use another MS which pauses its threads?

> Some improvements to CMS
> ------------------------
>
>                 Key: LUCENE-2755
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2755
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>
> While running optimize on a large index, I noticed several things that got 
> me to read the CMS code more carefully, and I found these issues:
> * CMS may hold onto a merge if maxMergeCount is hit. As a result, the 
> MergeThreads keep taking merges from the IndexWriter until they are 
> exhausted, and only then does the blocked merge run. I don't think that 
> merge needs to be blocked.
> * CMS sorts merges by segment size, measured in docs rather than bytes. Since 
> the default MP is LogByteSizeMP, and I doubt people still care about 
> doc-based segment sizes, I think we should switch the default impl. There are 
> two ways to make it extensible, if we want:
> ** Have an overridable member/method in CMS that you can extend and override 
> - easy.
> ** Have OneMerge be comparable and let the MP determine the order (e.g. by 
> bytes, docs, calibrating for deletes etc.). Better, but will need to tap into 
> several places in the code, so more risky and complicated.
> While at it, I'd like to add some documentation to CMS - it's not very easy to 
> read and follow.
> I'll work on a patch.
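
To illustrate the doc-based vs. bytes-based ordering discussed in the description, here is a minimal, self-contained sketch. It does not use Lucene's classes: PendingMerge, BY_DOCS, BY_BYTES and order() are hypothetical names made up for this example, and sorting smallest-first is an assumption, not a statement of what CMS or any patch actually does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class MergeOrderSketch {

    // Hypothetical stand-in for a pending merge; this is NOT Lucene's OneMerge.
    public static final class PendingMerge {
        public final String name;
        public final int docCount;
        public final long sizeInBytes;

        public PendingMerge(String name, int docCount, long sizeInBytes) {
            this.name = name;
            this.docCount = docCount;
            this.sizeInBytes = sizeInBytes;
        }
    }

    // Doc-based ordering: fewer documents sorts first.
    public static final Comparator<PendingMerge> BY_DOCS =
        Comparator.comparingInt(m -> m.docCount);

    // Bytes-based ordering: fewer bytes sorts first.
    public static final Comparator<PendingMerge> BY_BYTES =
        Comparator.comparingLong(m -> m.sizeInBytes);

    // Returns the merge names in the order given by the comparator,
    // leaving the input list untouched.
    public static List<String> order(List<PendingMerge> merges,
                                     Comparator<PendingMerge> cmp) {
        List<PendingMerge> sorted = new ArrayList<>(merges);
        sorted.sort(cmp);
        List<String> names = new ArrayList<>();
        for (PendingMerge m : sorted) {
            names.add(m.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<PendingMerge> merges = Arrays.asList(
            // Many tiny documents: high doc count, small on disk.
            new PendingMerge("manySmallDocs", 1_000_000, 50L * 1024 * 1024),
            // Few large documents: low doc count, large on disk.
            new PendingMerge("fewLargeDocs", 10_000, 2L * 1024 * 1024 * 1024));

        System.out.println("by docs:  " + order(merges, BY_DOCS));
        System.out.println("by bytes: " + order(merges, BY_BYTES));
    }
}
```

With these two made-up merges the orderings disagree: the doc-based comparator puts fewLargeDocs first, while the bytes-based one puts manySmallDocs first. Swapping the comparator is roughly what the "overridable member/method" option would allow.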


