[PR] [FLINK-30571] Estimate scalability coefficient from past scaling history using linear regression [flink-kubernetes-operator]

via GitHub Wed, 02 Apr 2025 05:18:01 -0700


pchoudhury22 opened a new pull request, #966:
URL: https://github.com/apache/flink-kubernetes-operator/pull/966


   ### What is the purpose of the change
   
   Currently, target parallelism computation assumes perfect linear scaling. 
However, real-time workloads often exhibit nonlinear scalability due to factors 
like network overhead and coordination costs.
   
   This change introduces an observed scalability coefficient, estimated using 
weighted linear regression on past (parallelism, processing rate) data, to 
improve the accuracy of scaling decisions.
   
   ### Brief change log
   
   Implemented a dynamic scaling coefficient to compute target parallelism 
based on observed scalability. The system estimates the scalability coefficient 
using a weighted least squares linear regression approach, leveraging 
historical (parallelism, processing rate) data.
   The regression model minimizes the weighted sum of squared errors, where 
weights are assigned based on both parallelism and recency to prioritize recent 
observations. The baseline processing rate is computed using the smallest 
observed parallelism in the history. Model details:
   
   #### The Linear Model
   We define a linear relationship between **parallelism (P)** and **processing 
rate (R)**:
   ```math
   R_i = β * P_i * α
   ```
   where:
   - **R_i** = actual processing rate for the *i-th* data point
   - **P_i** = parallelism for the *i-th* data point
   - **β** = base factor (constant scale factor)
   - **α** = scaling coefficient to optimize
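
As a minimal illustration of the model above (a sketch, not the operator's implementation), the predicted rate is simply the product of the three terms. The function name `predicted_rate` and the sample numbers are hypothetical, and β is assumed here to be a per-unit-of-parallelism rate. With α = 1 the model reduces to perfect linear scaling, while α < 1 describes sub-linear scaling.

```python
def predicted_rate(parallelism: int, beta: float, alpha: float) -> float:
    """Predicted processing rate under the model R = beta * P * alpha."""
    return beta * parallelism * alpha


# Hypothetical numbers: beta = 100 records/s per unit of parallelism.
linear = predicted_rate(8, beta=100.0, alpha=1.0)      # perfect linear scaling
sublinear = predicted_rate(8, beta=100.0, alpha=0.75)  # a quarter lost to overhead
```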
   
   #### Weighted Squared Error
   The loss function to minimize is the **weighted sum of squared errors 
(SSE)**:
   ```math
   Loss = Σ w_i * (R_i - R̂_i)^2
   ```
   Substituting \( R̂_i = (β α) P_i \):
   ```math
   Loss = Σ w_i * (R_i - β α P_i)^2
   ```
   where **w_i** is the weight for each data point.
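
The loss can be evaluated directly for a candidate α, which is handy for sanity-checking a fitted value. The following is a hedged sketch: `weighted_sse`, the sample history, and the uniform weights are illustrative assumptions, since the description does not spell out the exact weighting formula.

```python
from typing import Sequence


def weighted_sse(
    parallelisms: Sequence[float],
    rates: Sequence[float],
    weights: Sequence[float],
    beta: float,
    alpha: float,
) -> float:
    """Weighted sum of squared errors: sum of w_i * (R_i - beta * alpha * P_i)^2."""
    if not (len(parallelisms) == len(rates) == len(weights)):
        raise ValueError("parallelisms, rates and weights must have equal length")
    return sum(
        w * (r - beta * alpha * p) ** 2
        for p, r, w in zip(parallelisms, rates, weights)
    )


# Hypothetical history of (parallelism, processing rate) observations.
ps = [1, 2, 4]
rs = [100.0, 180.0, 320.0]
ws = [1.0, 1.0, 1.0]  # uniform weights, for illustration only

loss_linear = weighted_sse(ps, rs, ws, beta=100.0, alpha=1.0)
loss_damped = weighted_sse(ps, rs, ws, beta=100.0, alpha=0.8)
```

For this sample history, assuming perfect linear scaling (α = 1) fits worse than α = 0.8, because the observed rates grow more slowly than the parallelism.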
   
   #### Minimizing the Error
   Expanding \( (R_i - β α P_i)^2 \):
   ```math
   (R_i - β α P_i)^2 = R_i^2 - 2β α P_i R_i + (β α P_i)^2
   ```
   Multiplying by **w_i** and summing over all data points:
   ```math
   Loss = Σ w_i * (R_i^2 - 2β α P_i R_i + β^2 α^2 P_i^2)
   ```
   
   #### Solving for α
   To minimize with respect to **α**, set the derivative of the loss to zero, \( dLoss/dα = Σ w_i (-2β P_i R_i + 2β^2 α P_i^2) = 0 \), and solve for **α**:
   ```math
   α = (Σ w_i P_i R_i) / (β * Σ w_i P_i^2)
   ```
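
The closed-form solution translates into a few lines of code. This is a sketch under stated assumptions rather than the code in this PR: `estimate_alpha` is a hypothetical name, the weights are made-up values that merely favour later observations, and β is assumed to be the rate per unit of parallelism at the smallest observed parallelism (the description only says the baseline is computed from the smallest observed parallelism).

```python
from typing import Sequence


def estimate_alpha(
    parallelisms: Sequence[float],
    rates: Sequence[float],
    weights: Sequence[float],
    beta: float,
) -> float:
    """Closed-form weighted least squares estimate of alpha for R = beta * P * alpha."""
    numerator = sum(w * p * r for p, r, w in zip(parallelisms, rates, weights))
    denominator = beta * sum(w * p * p for p, w in zip(parallelisms, weights))
    if denominator == 0:
        raise ValueError("cannot estimate alpha: beta or all weighted parallelisms are zero")
    return numerator / denominator


# Hypothetical history, oldest observation first.
ps = [1, 2, 4]
rs = [100.0, 180.0, 320.0]
ws = [0.5, 0.75, 1.0]  # assumed weights favouring recent observations

# Assumption: baseline taken from the smallest observed parallelism.
i_min = min(range(len(ps)), key=lambda i: ps[i])
beta = rs[i_min] / ps[i_min]

alpha = estimate_alpha(ps, rs, ws, beta)
```

A fitted α below 1 indicates that, in the observed history, the processing rate grew more slowly than the parallelism.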
   
   ### Verifying this change
   
   New unit tests were added to cover this change.
   
   ### Does this pull request potentially affect one of the following parts:
   
   Dependencies (does it add or upgrade a dependency): no
   The public API, i.e., are there any changes to the CustomResourceDescriptors: no
   Core observer or reconciler logic that is regularly executed: no

