[ https://issues.apache.org/jira/browse/LUCENE-8162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16492469#comment-16492469 ]
Tommaso Teofili edited comment on LUCENE-8162 at 5/28/18 10:05 AM:
-------------------------------------------------------------------

{quote}but many users index at full speed for a long time and suppressing merges in that case is dangerous
{quote}
Yes, that might degrade search performance. To mitigate that, the proposed MP has a maximum number of segments allowed for throttling. For example, if throttling pushes the number of segments beyond a configurable threshold (e.g. 20), throttling does not kick in for the next merge, and it stays off until the number of segments drops back below the threshold (by using the standard TMP merge algorithm).

I have been trying to use [https://github.com/mikemccand/luceneutil] to run some benchmarks. However, it seems the tool only creates one index per benchmark. If anyone has suggestions about how to benchmark both indexing (time and space) and querying performance, that'd be great.


was (Author: teofili):
{quote}but many users index at full speed for a long time and suppressing merges in that case is dangerous
{quote}
Yes, that might degrade search performance. To mitigate that, the proposed MP has a maximum number of segments allowed for throttling. For example, if throttling pushes the number of segments beyond a configurable threshold (e.g. 20), throttling does not kick in for the next merge, and it stays off until the number of segments drops back below the threshold.

I have been trying to use [https://github.com/mikemccand/luceneutil] to run some benchmarks. However, it seems the tool only creates one index per benchmark.

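
The segment ceiling described in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the attached patch: the class name ThrottleGate, the method mayThrottle, and the inclusive boundary at the threshold are all hypothetical, and no Lucene API is used.

```java
/**
 * Hypothetical helper (not part of Lucene or of LUCENE-8162.0.patch).
 * Decides whether merges may be suppressed for the current commit.
 */
final class ThrottleGate {
    // Maximum number of segments for which throttling is still allowed (e.g. 20).
    private final int maxThrottledSegments;

    ThrottleGate(int maxThrottledSegments) {
        if (maxThrottledSegments < 1) {
            throw new IllegalArgumentException("maxThrottledSegments must be >= 1");
        }
        this.maxThrottledSegments = maxThrottledSegments;
    }

    /**
     * Returns true if merges may be suppressed. Throttling applies only while the
     * commit rate is high AND the segment count has not gone beyond the ceiling;
     * otherwise the caller is expected to fall back to the standard TMP merge
     * selection until the count drops again.
     * Assumption: a count exactly equal to the ceiling still allows throttling.
     */
    boolean mayThrottle(int segmentCount, boolean commitRateHigh) {
        return commitRateHigh && segmentCount <= maxThrottledSegments;
    }
}
```

With a ceiling of 20, a high commit rate at 10 segments would allow throttling, while 21 segments would hand control back to the regular merge algorithm regardless of commit rate.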
> Make it possible to throttle (Tiered)MergePolicy when commit rate is high
> -------------------------------------------------------------------------
>
>                 Key: LUCENE-8162
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8162
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Tommaso Teofili
>            Priority: Major
>             Fix For: trunk
>
>         Attachments: LUCENE-8162.0.patch
>
>
> As discussed in a recent mailing list thread [1] and observed in a project
> using Lucene (see OAK-5192 and OAK-6710), it is sometimes helpful to throttle
> the aggressiveness of (Tiered)MergePolicy when the commit rate is high.
> In the case of Apache Jackrabbit Oak, a dedicated {{MergePolicy}} was
> implemented [2].
> That MP doesn't merge when the number of segments is below a certain
> threshold (e.g. 30) and the commit rate (docs per sec and MB per sec) is high
> (e.g. above 1000 docs/sec, 5 MB/sec).
> In that implementation, the commit rate thresholds adapt to the average
> commit rate by means of single exponential smoothing.
> The results in that specific case looked encouraging: a 5% improvement in
> query performance and ~10% less IO. However, Oak has some specifics
> which might not fit other scenarios. Anyway, it could be interesting to see
> how this behaves in a plain Lucene scenario.
> [1] : [http://markmail.org/message/re3ifmq2664bqfjk]
> [2] :
> [https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/writer/CommitMitigatingTieredMergePolicy.java]
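
The adaptive thresholds mentioned in the issue description could look something like the sketch below. It uses the textbook single exponential smoothing update, s_t = alpha * x_t + (1 - alpha) * s_(t-1). The class SmoothedThreshold, its methods, and the choice to treat the smoothed average itself as the "high rate" threshold are assumptions made for illustration; the Oak implementation linked in [2] may differ in its details.

```java
/**
 * Hypothetical sketch (not the Oak or Lucene code): a commit rate threshold
 * that tracks the average observed rate via single exponential smoothing.
 */
final class SmoothedThreshold {
    private final double alpha;   // smoothing factor in (0, 1]
    private double threshold;     // current smoothed rate, used as the threshold

    SmoothedThreshold(double alpha, double initialThreshold) {
        if (alpha <= 0.0 || alpha > 1.0) {
            throw new IllegalArgumentException("alpha must be in (0, 1]");
        }
        this.alpha = alpha;
        this.threshold = initialThreshold;
    }

    /**
     * Compares the observed rate (e.g. docs/sec or MB/sec) against the current
     * threshold, then folds the observation into the smoothed average.
     * Returns true if the observed rate counted as "high".
     */
    boolean observe(double rate) {
        boolean high = rate > threshold;
        // Single exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_(t-1)
        threshold = alpha * rate + (1.0 - alpha) * threshold;
        return high;
    }

    double threshold() {
        return threshold;
    }
}
```

One instance per metric (one for docs/sec, one for MB/sec) would match the two rates named in the description. A larger alpha makes the threshold follow recent commit bursts more quickly, while a smaller one keeps it closer to the long-run average.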