[ 
https://issues.apache.org/jira/browse/LUCENE-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790976#action_12790976
 ] 

Michael McCandless commented on LUCENE-2164:
--------------------------------------------

{quote} 
This issue illustrates why the ram dir approach can be useful,
because small segment merges compete with large segment merges
for IO, which can spike the turnaround time. With a ram dir, the
small segments are held in RAM until they're large enough to be
placed onto disk. They can then be given the same IO priority as
the other merging segments which should result in consistent
reopen times.
{quote}

I'm still not convinced we should game the OS, here.

Ie, the small segments are likely still in the IO cache, so
we're probably not really competing for IO on the smaller merges.

I think it's when an immense merge is running that we're in trouble.
EG I see very long NRT reopen times when such a merge is running.

I've been wondering whether we need to take IO prioritization into our
own hands.  EG, instead of relying [only] on thread priorities for
CMS, somehow have the big merge pause until the small merge can
complete.  This would really be the best way to implement the "allow
at most 1 merge to run at once".  I guess we may be able to override
the mergeAbort to implement this...
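
To make the "big merge pauses until the small merge completes" idea concrete, here is a minimal sketch in plain Java. It does not use any Lucene API: MergePauseGate and its methods are hypothetical names, and it assumes the big merge can call checkpoint() between chunks of work (which is the part that would need a hook somewhere in the merge code).

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Hypothetical helper (not part of Lucene): lets small merges
 * temporarily pause big merges so they do not compete for IO.
 */
final class MergePauseGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition noSmallMerges = lock.newCondition();
    private int activeSmallMerges = 0;

    /** Called by a small merge thread before it starts its work. */
    void smallMergeStarted() {
        lock.lock();
        try {
            activeSmallMerges++;
        } finally {
            lock.unlock();
        }
    }

    /** Called by a small merge thread when it is done; wakes paused big merges. */
    void smallMergeFinished() {
        lock.lock();
        try {
            activeSmallMerges--;
            if (activeSmallMerges == 0) {
                noSmallMerges.signalAll();
            }
        } finally {
            lock.unlock();
        }
    }

    /**
     * Called by a big merge between chunks of work.
     * Blocks for as long as at least one small merge is running.
     */
    void checkpoint() throws InterruptedException {
        lock.lock();
        try {
            while (activeSmallMerges > 0) {
                noSmallMerges.await();
            }
        } finally {
            lock.unlock();
        }
    }

    /** Number of small merges currently running. */
    int activeSmallMerges() {
        lock.lock();
        try {
            return activeSmallMerges;
        } finally {
            lock.unlock();
        }
    }
}
```

How often the big merge reaches a checkpoint bounds how long a small merge still has to share the disk with it, so the chunk size is the main tuning knob in a scheme like this.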

I think a similar "emulate IO prioritization" approach may help when, eg, a
large merge/optimize is interfering with ongoing searches.  Yes, some
of the cost is because the merge is evicting IO cache, but for large
indexes that don't fit in RAM, there must be some cost to sharing the
disk heads, too...


> Make CMS smarter about thread priorities
> ----------------------------------------
>
>                 Key: LUCENE-2164
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2164
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 3.1
>
>         Attachments: LUCENE-2164.patch
>
>
> Spinoff from LUCENE-2161...
> The hard throttling CMS does (blocking the incoming thread that wants
> to launch a new merge) can be devastating when it strikes during NRT
> reopen.
> It can easily happen if a huge merge is off and running, but then a
> tiny merge is needed to clean up recently created segments due to
> frequent reopens.
> I think a small change to CMS, whereby it assigns a higher thread
> priority to tiny merges than big merges, should allow us to increase
> the max merge thread count again, and greatly reduce the chance that
> NRT's reopen would hit this.
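
The priority scheme from the issue summary above (tiny merges get a higher thread priority than big ones) could look roughly like the sketch below. This is not the attached patch: MergePriority, priorityFor and both size cutoffs are hypothetical, chosen only for illustration. Note that Java thread priorities are just a hint to the OS scheduler and say nothing about IO scheduling, which is the limitation discussed in the comment.

```java
/**
 * Hypothetical helper (not taken from LUCENE-2164.patch): maps the
 * estimated size of a merge to a java.lang.Thread priority.
 */
final class MergePriority {
    private MergePriority() {}

    /**
     * @param mergeBytes       estimated total size of the segments being merged
     * @param smallCutoffBytes merges at or below this size count as tiny (assumed cutoff)
     * @param largeCutoffBytes merges at or above this size count as big (assumed cutoff)
     * @return a value between Thread.MIN_PRIORITY and Thread.MAX_PRIORITY
     */
    static int priorityFor(long mergeBytes, long smallCutoffBytes, long largeCutoffBytes) {
        if (mergeBytes <= smallCutoffBytes) {
            return Thread.MAX_PRIORITY;  // tiny merge: finish it as fast as possible
        }
        if (mergeBytes >= largeCutoffBytes) {
            return Thread.MIN_PRIORITY;  // big merge: yield to everything else
        }
        return Thread.NORM_PRIORITY;     // everything in between
    }
}
```

A merge thread would then call something like Thread.currentThread().setPriority(MergePriority.priorityFor(size, small, large)) before starting its work.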

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

