[ 
https://issues.apache.org/jira/browse/FLINK-31976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tan Kim updated FLINK-31976:
----------------------------
    Description: 
Whether a scale-up counts as inefficient is determined as follows:
{code:java}
double lastProcRate =
        lastSummary.getMetrics().get(TRUE_PROCESSING_RATE).getAverage();
double lastExpectedProcRate =
        lastSummary.getMetrics().get(EXPECTED_PROCESSING_RATE).getCurrent();
var currentProcRate = evaluatedMetrics.get(TRUE_PROCESSING_RATE).getAverage();
double expectedIncrease = lastExpectedProcRate - lastProcRate;
double actualIncrease = currentProcRate - lastProcRate;

boolean withinEffectiveThreshold =
        (actualIncrease / expectedIncrease)
                >= conf.get(AutoScalerOptions.SCALING_EFFECTIVENESS_THRESHOLD);
{code}
Because expectedIncrease is derived from the last scaling history entry, it 
does not change unless another scale-up happens; only actualIncrease changes.
actualIncrease depends on currentProcRate (the average of 
TRUE_PROCESSING_RATE), and TRUE_PROCESSING_RATE is calculated as follows:
trueProcessingRate = busyTimeMultiplier * numRecordsInPerSecond.getSum()

For example, suppose a scale-up has been marked as inefficient, but the lag 
continues to build up.
Another scale-up is needed to eliminate the growing lag, but because the last 
one is marked as inefficient, it will not happen.
For the scale-up to stop being treated as inefficient, the following condition 
must hold: 
actualIncrease / expectedIncrease >= SCALING_EFFECTIVENESS_THRESHOLD (default 0.1)
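
As a rough illustration of that condition, the sketch below computes the smallest currentProcRate that would pass the check. The class and method names are hypothetical and are not part of the Flink autoscaler; the two rates are the sample values recorded in the earlier revision of this description (22569.315633422066 and 37340.0), and the sketch assumes expectedIncrease is positive.

```java
// Illustrative sketch only; it mirrors the check quoted in this issue and is
// not the Flink autoscaler implementation. All names here are hypothetical.
final class ScaleUpEffectivenessSketch {

    // Same comparison as the quoted snippet: actual gain over expected gain.
    static boolean withinEffectiveThreshold(
            double lastProcRate,
            double lastExpectedProcRate,
            double currentProcRate,
            double threshold) {
        double expectedIncrease = lastExpectedProcRate - lastProcRate;
        double actualIncrease = currentProcRate - lastProcRate;
        return (actualIncrease / expectedIncrease) >= threshold;
    }

    // Smallest currentProcRate that passes the check.
    // Assumes expectedIncrease (lastExpectedProcRate - lastProcRate) is positive.
    static double minPassingProcRate(
            double lastProcRate, double lastExpectedProcRate, double threshold) {
        return lastProcRate + threshold * (lastExpectedProcRate - lastProcRate);
    }

    public static void main(String[] args) {
        double lastProcRate = 22569.315633422066; // sample value from the earlier revision
        double lastExpectedProcRate = 37340.0;    // sample value from the earlier revision
        double threshold = 0.1;                   // default stated in this issue

        // Roughly 24046 records/s are needed before the check passes.
        System.out.println(minPassingProcRate(lastProcRate, lastExpectedProcRate, threshold));

        // 23000.0 is a made-up observed rate, below that minimum.
        System.out.println(
                withinEffectiveThreshold(lastProcRate, lastExpectedProcRate, 23000.0, threshold));
    }
}
```

With these sample values, any observed rate below roughly 24046 keeps the scale-up marked as inefficient.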

Here, expectedIncrease is fixed by lastSummary, so actualIncrease itself must 
grow for the ratio to rise.
However, actualIncrease depends on busyTimeMultiplier and 
numRecordsInPerSecond, and both converge to a steady value if no scaling 
occurs.
Therefore, actualIncrease also converges.
If it converges below the threshold, no further scale-up is possible, 
even if the lag continues to build up.
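
The stuck state can be sketched with a small simulation. Everything here is an assumption made for illustration: the class name, the plateau values, and the convergence model (the gap to the plateau halves on each evaluation) are hypothetical and do not come from the autoscaler. Only the ratio check follows the snippet quoted in this issue.

```java
// Illustrative simulation only; the convergence model and all names are
// hypothetical and do not come from the Flink autoscaler.
final class ConvergingRateSketch {

    // Counts how many evaluations pass the effectiveness check while the
    // observed processing rate converges towards `plateau`.
    static int countPassingEvaluations(
            double lastProcRate,
            double lastExpectedProcRate,
            double plateau,
            double threshold,
            int evaluations) {
        // Fixed until the next scale-up, as described in this issue.
        double expectedIncrease = lastExpectedProcRate - lastProcRate;
        double remainingGap = plateau - lastProcRate;
        int passing = 0;
        for (int i = 0; i < evaluations; i++) {
            remainingGap *= 0.5; // assumed model: gap to the plateau halves each time
            double currentProcRate = plateau - remainingGap;
            double actualIncrease = currentProcRate - lastProcRate;
            if ((actualIncrease / expectedIncrease) >= threshold) {
                passing++;
            }
        }
        return passing;
    }

    public static void main(String[] args) {
        double lastProcRate = 22569.315633422066; // sample value from the earlier revision
        double lastExpectedProcRate = 37340.0;    // sample value from the earlier revision

        // Hypothetical plateau below the passing rate: the check never passes.
        System.out.println(
                countPassingEvaluations(lastProcRate, lastExpectedProcRate, 23500.0, 0.1, 10));
        // Hypothetical plateau above the passing rate: the check passes eventually.
        System.out.println(
                countPassingEvaluations(lastProcRate, lastExpectedProcRate, 25000.0, 0.1, 10));
    }
}
```

In the first case the ratio can never reach the threshold no matter how many evaluations run, which is the situation this issue describes.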

  was:
The determination of whether it is an inefficient scale-up is calculated as 
follows


{code:java}
double lastProcRate = 
lastSummary.getMetrics().get(TRUE_PROCESSING_RATE).getAverage(); // 22569.315633422066
double lastExpectedProcRate =
lastSummary.getMetrics().get(EXPECTED_PROCESSING_RATE).getCurrent(); // 37340.0
var currentProcRate = evaluatedMetrics.get(TRUE_PROCESSING_RATE).getAverage();
double expectedIncrease = lastExpectedProcRate - lastProcRate;
double actualIncrease = currentProcRate - lastProcRate;

boolean withinEffectiveThreshold =
(actualIncrease / expectedIncrease)
>= conf.get(AutoScalerOptions.SCALING_EFFECTIVENESS_THRESHOLD);{code}

Because the expectedIncrease value references the last scaling history, it will 
not change unless there is an additional scale-up, only the actualIncrease 
value will change.
The actualIncrease value is currentProcRate( avg of TRUE_PROCESSING_RATE),
The calculation of TRUE_PROCESSING_RATE is as follows
trueProcessingRate = busyTimeMultiplier * numRecordsInPerSecond.getSum()

For example, let's say you've been marked as an inefficient scale-up, but the 
LAG continues to build up.
You need to scale up to eliminate the growing LAG, but because you're marked as 
an inefficient scale-up, it won't happen.
To unmark a scaleup as inefficient, the following conditions must be met: 
actualIncrease/expectedIncrease > SCALING_EFFECTIVENESS_THRESHOLD (default 0.1)

Here, expectedIncrease is a constant with lastSummary, so the value of 
actualIncrease must increase.
However, the actualIncrease value is proportional to busyTimeMultiplier and 
numRecordsInPerSecond, and these two values will converge to a certain value if 
no scaling occurs.
Therefore, the value of actualIncrease will also converge.
If this value fails to cross a threshold, no further scaling up is possible, 
even if the lag continues to build up.


> Once marked as an inefficient scale-up, further scaling may not happen forever
> ------------------------------------------------------------------------------
>
>                 Key: FLINK-31976
>                 URL: https://issues.apache.org/jira/browse/FLINK-31976
>             Project: Flink
>          Issue Type: Improvement
>          Components: Autoscaler
>    Affects Versions: 1.17.0
>            Reporter: Tan Kim
>            Priority: Major
>
> Whether a scale-up counts as inefficient is determined as follows:
> {code:java}
> double lastProcRate = 
> lastSummary.getMetrics().get(TRUE_PROCESSING_RATE).getAverage();
> double lastExpectedProcRate =
> lastSummary.getMetrics().get(EXPECTED_PROCESSING_RATE).getCurrent();
> var currentProcRate = evaluatedMetrics.get(TRUE_PROCESSING_RATE).getAverage();
> double expectedIncrease = lastExpectedProcRate - lastProcRate;
> double actualIncrease = currentProcRate - lastProcRate;
> boolean withinEffectiveThreshold =
> (actualIncrease / expectedIncrease)
> >= conf.get(AutoScalerOptions.SCALING_EFFECTIVENESS_THRESHOLD);{code}
> Because expectedIncrease is derived from the last scaling history entry, it 
> does not change unless another scale-up happens; only actualIncrease changes.
> actualIncrease depends on currentProcRate (the average of 
> TRUE_PROCESSING_RATE), and TRUE_PROCESSING_RATE is calculated as follows:
> trueProcessingRate = busyTimeMultiplier * numRecordsInPerSecond.getSum()
> For example, suppose a scale-up has been marked as inefficient, but the lag 
> continues to build up.
> Another scale-up is needed to eliminate the growing lag, but because the last 
> one is marked as inefficient, it will not happen.
> For the scale-up to stop being treated as inefficient, the following 
> condition must hold: 
> actualIncrease / expectedIncrease >= SCALING_EFFECTIVENESS_THRESHOLD (default 
> 0.1)
> Here, expectedIncrease is fixed by lastSummary, so actualIncrease itself must 
> grow for the ratio to rise.
> However, actualIncrease depends on busyTimeMultiplier and 
> numRecordsInPerSecond, and both converge to a steady value if no scaling 
> occurs.
> Therefore, actualIncrease also converges.
> If it converges below the threshold, no further scale-up is possible, 
> even if the lag continues to build up.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
