[ https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
kkewwei updated LUCENE-10448:
-----------------------------
    Description:
Consider this code in *MergeRateLimiter*:
{code:java}
private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedException {

    double rate = mbPerSec;
    double secondsToPause = (bytes / 1024. / 1024.) / rate;
    long targetNS = lastNS + (long) (1000000000 * secondsToPause);
    long curPauseNS = targetNS - curNS;
    // We don't bother with thread pausing if the pause is smaller than 2 msec.
    if (curPauseNS <= MIN_PAUSE_NS) {
      // Set to curNS, not targetNS, to enforce the instant rate, not
      // the "averaged over all history" rate:
      lastNS = curNS;
      return -1;
    }
   ......
}
{code}
While a segment is being merged, suppose *maybePause* is called at 7:00, which sets lastNS=7:00, and is then called again at 7:05. The value of *targetNS=lastNS + (long) (1000000000 * secondsToPause)* is then smaller than *curNS* for any realistic value of bytes, so the method returns -1 and skips the pause.

I counted the total number of calls to *maybePause* (callTimes), the number of calls that skipped the pause (ignorePauseTimes), and the bytes passed to each skipped call (detailBytes):
{code:java}
[2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1]
[index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219
docs], [0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec
throttle], [callTimes=857], [ignorePauseTimes=25], [detailBytes(mb) =
[0.28899956, 0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104,
0.27990723, 0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673,
0.28144264, 0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447,
0.28002167, 0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
{code}
Of the 857 calls to *maybePause*, 25 skipped the pause, and the bytes passed to those skipped calls (for example 0.28125 MB) are not small.

Whenever the interval between two *maybePause* calls is relatively long, the pause that should have been applied is skipped.
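To make the arithmetic concrete, here is a minimal standalone sketch of the pause computation quoted above. It is not Lucene's code: the class name {{MergePauseSketch}} and the helper {{pauseNS}} are hypothetical, and {{MIN_PAUSE_NS}} is assumed to be 2 ms based only on the comment in the snippet. At the logged rate of 11.2 MB/sec, a 0.28125 MB write would call for roughly 25 ms of pause, yet the call is skipped once five minutes have elapsed since lastNS.

```java
class MergePauseSketch {
    // Assumption: 2 ms, inferred from the "smaller than 2 msec" comment in the quoted snippet.
    static final long MIN_PAUSE_NS = 2_000_000L;

    /**
     * Hypothetical helper mirroring the quoted formula.
     * Returns the pause in nanoseconds, or -1 when the pause is skipped.
     */
    static long pauseNS(long bytes, double mbPerSec, long lastNS, long curNS) {
        double secondsToPause = (bytes / 1024. / 1024.) / mbPerSec;
        long targetNS = lastNS + (long) (1_000_000_000L * secondsToPause);
        long curPauseNS = targetNS - curNS;
        return curPauseNS <= MIN_PAUSE_NS ? -1 : curPauseNS;
    }

    public static void main(String[] args) {
        long bytes = 294_912L;   // 0.28125 MB, one of the skipped sizes in the log
        double rate = 11.2;      // MB/sec throttle reported in the log

        // Second call arrives 1 ms after the first: a pause is requested.
        System.out.println("1 ms gap:  " + pauseNS(bytes, rate, 0L, 1_000_000L) + " ns");

        // Second call arrives 5 minutes after the first: targetNS is far
        // behind curNS, so the pause is skipped and -1 is returned.
        System.out.println("5 min gap: " + pauseNS(bytes, rate, 0L, 300_000_000_000L));
    }
}
```

The second case returns -1 regardless of the byte count unless the bytes alone would justify a pause longer than the elapsed gap, which matches the behaviour described in the report.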
> MergeRateLimiter doesn't always limit instant rate.
> ---------------------------------------------------
>
>                 Key: LUCENE-10448
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10448
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/other
>    Affects Versions: 8.11.1
>            Reporter: kkewwei
>            Priority: Major