[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10448:
-----------------------------
    Description: 
We can see the code in *MergeRateLimiter*:
{code:java}
private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedException {

  double rate = mbPerSec;
  double secondsToPause = (bytes / 1024. / 1024.) / rate;
  long targetNS = lastNS + (long) (1000000000 * secondsToPause);
  long curPauseNS = targetNS - curNS;

  // We don't bother with thread pausing if the pause is smaller than 2 msec.
  if (curPauseNS <= MIN_PAUSE_NS) {
    // Set to curNS, not targetNS, to enforce the instant rate, not
    // the "averaged over all history" rate:
    lastNS = curNS;
    return -1;
  }
  ......
}
{code}

Suppose a segment is being merged and *maybePause* is called at 7:00, so 
lastNS=7:00. If *maybePause* is next called at 7:05, then *targetNS=lastNS + 
(long) (1000000000 * secondsToPause)* is smaller than *curNS* unless the bytes 
would take more than five minutes to write at the configured rate. For any 
realistic value of bytes we therefore return -1 and skip the pause.
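
The scenario above can be reproduced outside Lucene with a small sketch. It 
copies only the arithmetic from the quoted snippet; the class and method names 
(MaybePauseSketch, curPauseNS, pauseSkipped) are hypothetical, MIN_PAUSE_NS is 
assumed to be 2 msec as the code comment states, and the 11.2 MB/sec rate is 
borrowed from the merge log for illustration only.

```java
import java.util.concurrent.TimeUnit;

class MaybePauseSketch {
  // Assumption: 2 msec, taken from the comment in the quoted snippet.
  static final long MIN_PAUSE_NS = TimeUnit.MILLISECONDS.toNanos(2);

  // Same arithmetic as the quoted maybePause: how long we would still have to wait.
  static long curPauseNS(long bytes, long lastNS, long curNS, double mbPerSec) {
    double secondsToPause = (bytes / 1024. / 1024.) / mbPerSec;
    long targetNS = lastNS + (long) (1000000000 * secondsToPause);
    return targetNS - curNS;
  }

  // True when the quoted code would take the "return -1" branch.
  static boolean pauseSkipped(long bytes, long lastNS, long curNS, double mbPerSec) {
    return curPauseNS(bytes, lastNS, curNS, mbPerSec) <= MIN_PAUSE_NS;
  }

  public static void main(String[] args) {
    long lastNS = 0L;                           // "7:00"
    long curNS = TimeUnit.MINUTES.toNanos(5);   // "7:05"
    long bytes = 1024L * 1024 * 1024;           // 1 GiB, deliberately large
    double mbPerSec = 11.2;                     // illustrative rate

    System.out.println("curPauseNS   = " + curPauseNS(bytes, lastNS, curNS, mbPerSec));
    System.out.println("pauseSkipped = " + pauseSkipped(bytes, lastNS, curNS, mbPerSec));
  }
}
```

Even with 1 GiB reported in a single call, curPauseNS comes out negative here, 
so the sketch takes the same branch that returns -1.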

I counted the total number of calls to *maybePause* (callTimes), the number of 
calls where the pause was skipped (ignorePauseTimes), and the bytes passed to 
each skipped call (detailBytes):
{code:java}
[2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 docs], 
[0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec throttle], 
[callTimes=857], [ignorePauseTimes=25],  [detailBytes(mb) = [0.28899956, 
0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 0.27990723, 
0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 0.28144264, 
0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 0.28002167, 
0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
{code}
*maybePause* was called 857 times, and 25 of those calls skipped the pause. 
The bytes passed to the skipped calls (for example 0.28125 MB) are not small.

Whenever the interval between two *maybePause* calls is long enough, a pause 
that should have happened is skipped.
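
To put a number on "not small", the sketch below computes the pause each 
skipped write would have earned on its own, using the same formula as the 
quoted snippet. It assumes the 11.2 MB/sec figure from the log was the rate in 
effect for those calls (the rate may have differed during the merge) and that 
MIN_PAUSE_NS is 2 msec as the code comment states. SkippedPauseSketch and 
owedPauseNS are hypothetical names.

```java
import java.util.concurrent.TimeUnit;

class SkippedPauseSketch {
  // Assumption: 2 msec, taken from the comment in the quoted snippet.
  static final long MIN_PAUSE_NS = TimeUnit.MILLISECONDS.toNanos(2);

  // Pause, in nanoseconds, that a write of `mb` megabytes earns at `mbPerSec`.
  static long owedPauseNS(double mb, double mbPerSec) {
    double secondsToPause = mb / mbPerSec;
    return (long) (1000000000 * secondsToPause);
  }

  public static void main(String[] args) {
    double mbPerSec = 11.2; // from the log line; assumed constant here
    // A few of the detailBytes(mb) values reported in the log.
    double[] skippedMb = {0.28899956, 0.27990723, 0.28125};

    for (double mb : skippedMb) {
      long ns = owedPauseNS(mb, mbPerSec);
      System.out.printf("%.8f MB -> %.1f ms owed (threshold %.1f ms)%n",
          mb, ns / 1e6, MIN_PAUSE_NS / 1e6);
    }
  }
}
```

Under these assumptions each skipped write corresponds to roughly 25 ms of 
pause, an order of magnitude above the 2 msec threshold.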

 

> MergeRateLimiter doesn't always limit instant rate.
> ---------------------------------------------------
>
>                 Key: LUCENE-10448
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10448
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/other
>    Affects Versions: 8.11.1
>            Reporter: kkewwei
>            Priority: Major


