[ 
https://issues.apache.org/jira/browse/LUCENE-8162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16492469#comment-16492469
 ] 

Tommaso Teofili edited comment on LUCENE-8162 at 5/28/18 10:05 AM:
-------------------------------------------------------------------

{quote}but many users index at full speed for a long time and suppressing 
merges in that case is dangerous
{quote}
Yes, that might degrade search performance. To mitigate that, the proposed 
MP caps the number of segments up to which throttling is allowed. For example, 
if throttling pushes the number of segments beyond a configurable threshold 
(e.g. 20), throttling is skipped for subsequent merges until the standard TMP 
merge algorithm has brought the number of segments back under the threshold.
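
To make the fallback rule concrete, here is a minimal, self-contained sketch. It is not taken from the attached patch: the class name {{SegmentCapGate}}, the method {{mayThrottle}} and the constructor parameter are hypothetical, and it deliberately does not use any Lucene API.

```java
/**
 * Hypothetical sketch, not the LUCENE-8162 patch: decides whether merge
 * throttling may apply for the current number of segments.
 */
final class SegmentCapGate {

    /** Assumed configurable cap, e.g. 20 as in the example above. */
    private final int maxSegmentsForThrottling;

    SegmentCapGate(int maxSegmentsForThrottling) {
        if (maxSegmentsForThrottling < 1) {
            throw new IllegalArgumentException("cap must be >= 1");
        }
        this.maxSegmentsForThrottling = maxSegmentsForThrottling;
    }

    /**
     * Returns true if merges may be suppressed. A false result means the
     * caller should fall back to the standard TMP merge selection, which
     * eventually brings the segment count back under the cap.
     */
    boolean mayThrottle(int segmentCount, boolean commitRateHigh) {
        return commitRateHigh && segmentCount <= maxSegmentsForThrottling;
    }
}
```

With a cap of 20, {{mayThrottle(21, true)}} returns false, so standard merging takes over however high the commit rate is.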

I have been trying to use [https://github.com/mikemccand/luceneutil] to run 
some benchmarks. However, it seems the tool only creates one index per 
benchmark. If anyone has suggestions on how to benchmark both indexing (time 
and space) and querying performance, that would be great.


was (Author: teofili):
{quote}but many users index at full speed for a long time and suppressing 
merges in that case is dangerous
{quote}
Yes, that might degrade search performance. To mitigate that, the proposed 
MP caps the number of segments up to which throttling is allowed. For example, 
if throttling pushes the number of segments beyond a configurable threshold 
(e.g. 20), throttling is skipped for subsequent merges until the number of 
segments is back under the threshold.

I have been trying to use [https://github.com/mikemccand/luceneutil] to run 
some benchmarks. However, it seems the tool only creates one index per 
benchmark. 

> Make it possible to throttle (Tiered)MergePolicy when commit rate is high
> -------------------------------------------------------------------------
>
>                 Key: LUCENE-8162
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8162
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Tommaso Teofili
>            Priority: Major
>             Fix For: trunk
>
>         Attachments: LUCENE-8162.0.patch
>
>
> As discussed in a recent mailing list thread [1] and observed in a project 
> using Lucene (see OAK-5192 and OAK-6710), it is sometimes helpful to throttle 
> the aggressiveness of (Tiered)MergePolicy when commit rate is high.
> In the case of Apache Jackrabbit Oak a dedicated {{MergePolicy}} was 
> implemented [2].
> That MP doesn't merge when the number of segments is below a certain 
> threshold (e.g. 30) and the commit rate (docs per sec and MB per sec) is high 
> (e.g. above 1000 docs/sec, 5 MB/sec).
> In that implementation, the commit rate thresholds adapt to the average 
> commit rate by means of single exponential smoothing.
> The results in that specific case looked encouraging, as it brought a 5% 
> performance improvement in querying and ~10% less IO. However, Oak has some 
> specifics that might not fit other scenarios. Still, it would be interesting 
> to see how this behaves in a plain Lucene scenario.
> [1] : [http://markmail.org/message/re3ifmq2664bqfjk]
> [2] : 
> [https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/writer/CommitMitigatingTieredMergePolicy.java]
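
The single exponential smoothing mentioned in the description can be sketched as follows. This is an illustrative reading, not the Oak code: the class {{SmoothedCommitRate}}, the smoothing factor {{alpha}} and the choice of taking the larger of a configured floor and the running average as the effective threshold are all assumptions.

```java
/**
 * Hypothetical sketch of a commit-rate threshold that adapts through
 * single exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_(t-1).
 */
final class SmoothedCommitRate {

    private final double alpha;   // smoothing factor in (0, 1], assumed configurable
    private double average;       // running smoothed commit rate
    private boolean initialized;  // false until the first observation arrives

    SmoothedCommitRate(double alpha) {
        if (!(alpha > 0.0 && alpha <= 1.0)) {
            throw new IllegalArgumentException("alpha must be in (0, 1]");
        }
        this.alpha = alpha;
    }

    /** Folds one observed rate (e.g. docs/sec) into the average and returns it. */
    double update(double observedRate) {
        if (!initialized) {
            average = observedRate; // seed the average with the first sample
            initialized = true;
        } else {
            average = alpha * observedRate + (1.0 - alpha) * average;
        }
        return average;
    }

    /**
     * Effective threshold: never below the configured floor, raised to the
     * running average once that grows past the floor (an assumption made
     * for this sketch).
     */
    double threshold(double configuredFloor) {
        return initialized ? Math.max(configuredFloor, average) : configuredFloor;
    }
}
```

With {{alpha}} = 0.5 and observed rates of 1000, 2000 and 1000 docs/sec, the running average is 1000, then 1500, then 1250, so a configured floor of 1000 docs/sec would be raised to 1250.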



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
