[ 
https://issues.apache.org/jira/browse/LUCENE-8162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated LUCENE-8162:
------------------------------------
    Description: 
As discussed in a [recent mailing list
thread|http://markmail.org/message/re3ifmq2664bqfjk] and observed in a
project using Lucene (see OAK-5192 and OAK-6710), it is sometimes helpful to
throttle the aggressiveness of (Tiered)MergePolicy when the commit rate is high.

In the case of Apache Jackrabbit Oak, a dedicated {{MergePolicy}} was
implemented [1].

That merge policy skips merging when the number of segments is below a certain
threshold (e.g. 30) and the commit rate (docs per second and MB per second) is
high (e.g. above 1000 docs/sec, 5 MB/sec).

In that implementation, the commit rate thresholds adapt to the average commit
rate by means of single exponential smoothing.
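
To make the rule above concrete, here is a minimal, self-contained sketch. It
is not the Oak class referenced in [1] and it does not use any Lucene API: the
class name {{CommitRateThrottle}}, its method names and its parameters are
hypothetical. It also assumes that exceeding either rate threshold counts as a
"high" commit rate, and that the thresholds are smoothed after each decision;
the actual implementation may differ on both points.

```java
/**
 * Hypothetical helper illustrating the throttling rule described above.
 * This is NOT Oak's CommitMitigatingTieredMergePolicy; all names are made up.
 */
final class CommitRateThrottle {
    private final int maxSegmentsForThrottling; // e.g. 30
    private final double alpha;                 // smoothing factor in (0, 1]
    private double docsPerSecThreshold;         // e.g. starts at 1000
    private double mbPerSecThreshold;           // e.g. starts at 5

    CommitRateThrottle(int maxSegmentsForThrottling, double initialDocsPerSec,
                       double initialMbPerSec, double alpha) {
        if (alpha <= 0.0 || alpha > 1.0) {
            throw new IllegalArgumentException("alpha must be in (0, 1]");
        }
        this.maxSegmentsForThrottling = maxSegmentsForThrottling;
        this.docsPerSecThreshold = initialDocsPerSec;
        this.mbPerSecThreshold = initialMbPerSec;
        this.alpha = alpha;
    }

    /**
     * Returns true when merging should be skipped for this commit, then
     * moves both thresholds towards the observed rates.
     */
    boolean shouldSkipMerge(int segmentCount, double docsPerSec, double mbPerSec) {
        boolean fewSegments = segmentCount < maxSegmentsForThrottling;
        // Assumption: exceeding either threshold counts as a high commit rate.
        boolean highRate = docsPerSec > docsPerSecThreshold
                || mbPerSec > mbPerSecThreshold;

        // Single exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_(t-1)
        docsPerSecThreshold = alpha * docsPerSec + (1.0 - alpha) * docsPerSecThreshold;
        mbPerSecThreshold = alpha * mbPerSec + (1.0 - alpha) * mbPerSecThreshold;

        return fewSegments && highRate;
    }

    double docsPerSecThreshold() {
        return docsPerSecThreshold;
    }

    double mbPerSecThreshold() {
        return mbPerSecThreshold;
    }
}
```

In a real {{MergePolicy}}, a check like this would sit in front of the normal
merge selection, and the policy would return no merges whenever
{{shouldSkipMerge}} is true. A smaller {{alpha}} makes the thresholds follow
the observed commit rate more slowly.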

The results in that specific case looked encouraging: a 5% improvement in
query performance and roughly 10% less IO. However, Oak has some specifics
that might not apply to other scenarios. Still, it would be interesting to see
how this behaves in a plain Lucene scenario.

[1] : 
[https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/writer/CommitMitigatingTieredMergePolicy.java]

  was:
As discussed in a [recent mailing list 
thread|[http://markmail.org/message/re3ifmq2664bqfjk],]] and observed in a 
project using Lucene (see OAK-5192 and OAK-6710), it is sometimes helpful to 
throttle the aggressiveness of (Tiered)MergePolicy when commit rate is high.

In the case of Apache Jackrabbit Oak a dedicated {{MergePolicy}} was 
implemented [1].

That MP didn't merge in case the number of segments is below a certain 
threshold (e.g. 30) and commit rate (docs per sec and MB per sec) is high (e.g. 
above 1000 doc / sec , 5MB / sec).

In such impl, the commit rate thresholds adapt to average commit rate by means 
of single exponential smoothing.

The results in that specific case looked encouraging as it brought a 5% perf 
improvement in querying and ~10% reduced IO. However Oak has some specifics 
which might not fit in other scenarios. Anyway it could be interesting to see 
how this behaves in plain Lucene scenario.

[1] : 
[https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/writer/CommitMitigatingTieredMergePolicy.java]


> Make it possible to throttle (Tiered)MergePolicy when commit rate is high
> -------------------------------------------------------------------------
>
>                 Key: LUCENE-8162
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8162
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Tommaso Teofili
>            Priority: Major
>             Fix For: trunk
>
>
> As discussed in a [recent mailing list
> thread|http://markmail.org/message/re3ifmq2664bqfjk] and observed in a
> project using Lucene (see OAK-5192 and OAK-6710), it is sometimes helpful to
> throttle the aggressiveness of (Tiered)MergePolicy when the commit rate is
> high.
> In the case of Apache Jackrabbit Oak, a dedicated {{MergePolicy}} was
> implemented [1].
> That merge policy skips merging when the number of segments is below a
> certain threshold (e.g. 30) and the commit rate (docs per second and MB per
> second) is high (e.g. above 1000 docs/sec, 5 MB/sec).
> In that implementation, the commit rate thresholds adapt to the average
> commit rate by means of single exponential smoothing.
> The results in that specific case looked encouraging: a 5% improvement in
> query performance and roughly 10% less IO. However, Oak has some specifics
> that might not apply to other scenarios. Still, it would be interesting to
> see how this behaves in a plain Lucene scenario.
> [1] : 
> [https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/writer/CommitMitigatingTieredMergePolicy.java]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
