[ 
https://issues.apache.org/jira/browse/FLINK-31924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715824#comment-17715824
 ] 

Maximilian Michels commented on FLINK-31924:
--------------------------------------------

Could you clarify what the issue is here? The logs don't indicate a problem. The
autoscaler will continue to run even after a vertex reaches its max parallelism,
because the max parallelism is per vertex. There may be other vertices which
still get scaled.
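
To illustrate the per-vertex limit, here is a minimal Python sketch. It is not
the operator's code: the vertex names, the numbers, and the clamp_parallelism
helper are made up for illustration only.

```python
def clamp_parallelism(desired, max_parallelism):
    """Cap each vertex's desired parallelism at that vertex's own maximum."""
    return {v: min(p, max_parallelism[v]) for v, p in desired.items()}


# Hypothetical job with two vertices; "source" already runs at its maximum.
current = {"source": 4, "map": 2}
desired = {"source": 8, "map": 4}
limits = {"source": 4, "map": 16}

new = clamp_parallelism(desired, limits)

# Vertices whose parallelism would still change in this round.
changed = [v for v in new if new[v] != current[v]]
print(new, changed)
```

In this sketch "source" stays capped at its limit while "map" is still scaled,
so a scaling decision can follow even though one vertex has reached its max.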

> [Flink operator] Flink Autoscale - Limit the max number of scale ups
> --------------------------------------------------------------------
>
>                 Key: FLINK-31924
>                 URL: https://issues.apache.org/jira/browse/FLINK-31924
>             Project: Flink
>          Issue Type: Bug
>          Components: Autoscaler, Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.4.0
>            Reporter: Sriram Ganesh
>            Priority: Critical
>
> Found that autoscaling keeps happening even after reaching max parallelism.
> Flink version: 1.17
> Source: Kafka
> Configuration:
>  
> {code:java}
> flinkConfiguration:
>     kubernetes.operator.job.autoscaler.enabled: "true"
>     kubernetes.operator.job.autoscaler.scaling.sources.enabled: "true"
>     kubernetes.operator.job.autoscaler.target.utilization: "0.6"
>     kubernetes.operator.job.autoscaler.target.utilization.boundary: "0.2"
>     kubernetes.operator.job.autoscaler.stabilization.interval: "1m"
>     kubernetes.operator.job.autoscaler.metrics.window: "3m"{code}
> Logs:
> {code:java}
> 2023-04-24 12:29:10,738 o.a.f.k.o.c.FlinkDeploymentController [INFO ][my-namespace/my-pod] Starting reconciliation
> 2023-04-24 12:29:10,740 o.a.f.k.o.s.FlinkResourceContextFactory [INFO ][my-namespace/my-pod] Getting service for my-job
> 2023-04-24 12:29:10,740 o.a.f.k.o.o.JobStatusObserver  [INFO ][my-namespace/my-pod] Observing job status
> 2023-04-24 12:29:10,765 o.a.f.k.o.o.JobStatusObserver  [INFO ][my-namespace/my-pod] Job status changed from CREATED to RUNNING
> 2023-04-24 12:29:10,870 o.a.f.k.o.l.AuditUtils         [INFO ][my-namespace/my-pod] >>> Event  | Info    | JOBSTATUSCHANGED | Job status changed from CREATED to RUNNING
> 2023-04-24 12:29:10,938 o.a.f.k.o.l.AuditUtils         [INFO ][my-namespace/my-pod] >>> Status | Info    | STABLE          | The resource deployment is considered to be stable and won’t be rolled back
> 2023-04-24 12:29:10,986 o.a.f.k.o.a.ScalingMetricCollector [INFO ][my-namespace/my-pod] Skipping metric collection during stabilization period until 2023-04-24T12:30:10.765Z
> 2023-04-24 12:29:10,986 o.a.f.k.o.r.d.AbstractFlinkResourceReconciler [INFO ][my-namespace/my-pod] Resource fully reconciled, nothing to do...
> 2023-04-24 12:29:10,986 o.a.f.k.o.c.FlinkDeploymentController [INFO ][my-namespace/my-pod] End of reconciliation
> 2023-04-24 12:29:25,991 o.a.f.k.o.c.FlinkDeploymentController [INFO ][my-namespace/my-pod] Starting reconciliation
> 2023-04-24 12:29:25,992 o.a.f.k.o.s.FlinkResourceContextFactory [INFO ][my-namespace/my-pod] Getting service for my-job
> 2023-04-24 12:29:25,992 o.a.f.k.o.o.JobStatusObserver  [INFO ][my-namespace/my-pod] Observing job status
> 2023-04-24 12:29:26,005 o.a.f.k.o.o.JobStatusObserver  [INFO ][my-namespace/my-pod] Job status (RUNNING) unchanged
> 2023-04-24 12:29:26,053 o.a.f.k.o.a.ScalingMetricCollector [INFO ][my-namespace/my-pod] Skipping metric collection during stabilization period until 2023-04-24T12:30:10.765Z
> 2023-04-24 12:29:26,054 o.a.f.k.o.r.d.AbstractFlinkResourceReconciler [INFO ][my-namespace/my-pod] Resource fully reconciled, nothing to do...
> 2023-04-24 12:29:26,054 o.a.f.k.o.c.FlinkDeploymentController [INFO ][my-namespace/my-pod] End of reconciliation
> 2023-04-24 12:29:41,059 o.a.f.k.o.c.FlinkDeploymentController [INFO ][my-namespace/my-pod] Starting reconciliation
> 2023-04-24 12:29:41,060 o.a.f.k.o.s.FlinkResourceContextFactory [INFO ][my-namespace/my-pod] Getting service for my-job
> 2023-04-24 12:29:41,061 o.a.f.k.o.o.JobStatusObserver  [INFO ][my-namespace/my-pod] Observing job status
> 2023-04-24 12:29:41,075 o.a.f.k.o.o.JobStatusObserver  [INFO ][my-namespace/my-pod] Job status (RUNNING) unchanged
> 2023-04-24 12:29:41,116 o.a.f.k.o.a.ScalingMetricCollector [INFO ][my-namespace/my-pod] Skipping metric collection during stabilization period until 2023-04-24T12:30:10.765Z
> 2023-04-24 12:29:41,116 o.a.f.k.o.r.d.AbstractFlinkResourceReconciler [INFO ][my-namespace/my-pod] Resource fully reconciled, nothing to do...
> 2023-04-24 12:29:41,116 o.a.f.k.o.c.FlinkDeploymentController [INFO ][my-namespace/my-pod] End of reconciliation
> 2023-04-24 12:29:56,121 o.a.f.k.o.c.FlinkDeploymentController [INFO ][my-namespace/my-pod] Starting reconciliation
> 2023-04-24 12:29:56,122 o.a.f.k.o.s.FlinkResourceContextFactory [INFO ][my-namespace/my-pod] Getting service for my-job
> 2023-04-24 12:29:56,122 o.a.f.k.o.o.JobStatusObserver  [INFO ][my-namespace/my-pod] Observing job status
> 2023-04-24 12:29:56,134 o.a.f.k.o.o.JobStatusObserver  [INFO ][my-namespace/my-pod] Job status (RUNNING) unchanged
> 2023-04-24 12:29:56,178 o.a.f.k.o.a.ScalingMetricCollector [INFO ][my-namespace/my-pod] Skipping metric collection during stabilization period until 2023-04-24T12:30:10.765Z
> 2023-04-24 12:29:56,179 o.a.f.k.o.r.d.AbstractFlinkResourceReconciler [INFO ][my-namespace/my-pod] Resource fully reconciled, nothing to do...
> 2023-04-24 12:29:56,179 o.a.f.k.o.c.FlinkDeploymentController [INFO ][my-namespace/my-pod] End of reconciliation
> 2023-04-24 12:30:11,183 o.a.f.k.o.c.FlinkDeploymentController [INFO ][my-namespace/my-pod] Starting reconciliation
> 2023-04-24 12:30:11,184 o.a.f.k.o.s.FlinkResourceContextFactory [INFO ][my-namespace/my-pod] Getting service for my-job
> 2023-04-24 12:30:11,184 o.a.f.k.o.o.JobStatusObserver  [INFO ][my-namespace/my-pod] Observing job status
> 2023-04-24 12:30:11,193 o.a.f.k.o.o.JobStatusObserver  [INFO ][my-namespace/my-pod] Job status (RUNNING) unchanged
> 2023-04-24 12:30:11,367 o.a.f.k.o.a.m.ScalingMetrics   [ERROR][my-namespace/my-pod] Cannot compute source target data rate without numRecordsInPerSecond and pendingRecords (lag) metric for e5a72f353fc1e6bbf3bd96a41384998c.
> 2023-04-24 12:30:11,370 o.a.f.k.o.a.ScalingMetricCollector [INFO ][my-namespace/my-pod] Waiting until 2023-04-24T12:33:10.765Z so the initial metric window is full before starting scaling
> 2023-04-24 12:30:11,370 o.a.f.k.o.r.d.AbstractFlinkResourceReconciler [INFO ][my-namespace/my-pod] Resource fully reconciled, nothing to do...
> 2023-04-24 12:30:11,370 o.a.f.k.o.c.FlinkDeploymentController [INFO ][my-namespace/my-pod] End of reconciliation
> 2023-04-24 12:30:26,374 o.a.f.k.o.c.FlinkDeploymentController [INFO ][my-namespace/my-pod] Starting reconciliation
> 2023-04-24 12:30:26,375 o.a.f.k.o.s.FlinkResourceContextFactory [INFO ][my-namespace/my-pod] Getting service for my-job
> 2023-04-24 12:30:26,376 o.a.f.k.o.o.JobStatusObserver  [INFO ][my-namespace/my-pod] Observing job status
> 2023-04-24 12:30:26,385 o.a.f.k.o.o.JobStatusObserver  [INFO ][my-namespace/my-pod] Job status (RUNNING) unchanged
> 2023-04-24 12:30:26,542 o.a.f.k.o.a.m.ScalingMetrics   [ERROR][my-namespace/my-pod] Cannot compute source target data rate without numRecordsInPerSecond and pendingRecords (lag) metric for e5a72f353fc1e6bbf3bd96a41384998c.
> 2023-04-24 12:30:26,543 o.a.f.k.o.a.ScalingMetricCollector [INFO ][my-namespace/my-pod] Waiting until 2023-04-24T12:33:10.765Z so the initial metric window is full before starting scaling
> 2023-04-24 12:30:26,543 o.a.f.k.o.r.d.AbstractFlinkResourceReconciler [INFO ][my-namespace/my-pod] Resource fully reconciled, nothing to do...
> 2023-04-24 12:30:26,544 o.a.f.k.o.c.FlinkDeploymentController [INFO ][my-namespace/my-pod] End of reconciliation
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
