Hi Leon,

The adaptive scheduler alone cannot autoscale a Flink job. It only adjusts the
parallelism of a job to match the available slots [1]. To autoscale a job, we
additionally need a policy that recommends resources for the job and a
mechanism that adjusts the resources allocated to the job (i.e. the available
slots). For K8s standalone application mode, we can use reactive mode together
with K8s HPA: HPA collects pod metrics and scales the number of TMs, and the
adaptive scheduler rescales the job according to the available slots. For YARN
application mode, reactive mode is not available. However, in the upcoming 1.18
release, FLIP-291 [2] lets you declare the desired resources through the REST
API to adjust the resources allocated to the job. You would still need a policy
that recommends resources for the job and calls the API; for that, you can
refer to the autoscaler implementation in the Flink K8s operator [3].
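
To make the REST call mentioned above a bit more concrete, here is a minimal
sketch of what a caller might look like. It assumes the endpoint takes the
shape described in FLIP-291, i.e. a PUT to
/jobs/<job-id>/resource-requirements with a lower and upper parallelism bound
per job vertex; please verify the path and body against the 1.18 REST docs.
The function names build_requirements and put_requirements are hypothetical
helpers, not part of Flink.

```python
import json
import urllib.request


def build_requirements(vertex_parallelism):
    """Build the request body: one entry per job vertex ID.

    vertex_parallelism maps a job vertex ID (string) to a
    (lower_bound, upper_bound) pair of parallelism values.
    The JSON layout is an assumption based on FLIP-291.
    """
    body = {}
    for vertex_id, (lower, upper) in vertex_parallelism.items():
        if lower < 1 or upper < lower:
            raise ValueError(f"invalid bounds for {vertex_id}: {lower}..{upper}")
        body[vertex_id] = {
            "parallelism": {"lowerBound": lower, "upperBound": upper}
        }
    return body


def put_requirements(rest_url, job_id, body, timeout=10):
    """Send the requirements to the JobManager REST endpoint.

    The path below is assumed from FLIP-291; rest_url is the base URL
    of the JobManager REST service, e.g. "http://localhost:8081".
    Returns the HTTP status code of the response.
    """
    request = urllib.request.Request(
        url=f"{rest_url}/jobs/{job_id}/resource-requirements",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Your own policy would decide the bounds (for example from lag or busy-time
metrics), call build_requirements with them, and then call put_requirements
against the JobManager of the running application.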

[1] Elastic Scaling | Apache Flink
<https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#adaptive-scheduler>
[2] FLIP-291: Externalized Declarative Resource Management - Apache Flink - Apache Software Foundation
<https://cwiki.apache.org/confluence/display/FLINK/FLIP-291%3A+Externalized+Declarative+Resource+Management>
[3] Autoscaler | Apache Flink Kubernetes Operator
<https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.5/docs/custom-resource/autoscaler/>

Best,
Zhanghao Chen
________________________________
From: Leon Xu <l...@attentivemobile.com>
Sent: June 27, 2023 13:41
To: user <user@flink.apache.org>
Subject: Questions regarding adaptive scheduler with YARN and application mode

Hi Flink users,

I am trying to use Adaptive Scheduler to auto scale our Flink streaming jobs 
(NOT batch job). Our jobs are running on YARN with application mode. There 
isn't much doc around how adaptive scheduler works. So I have some questions:


  1.  How does Adaptive Scheduler work with YARN/Application mode? If the 
scheduler decides to request more tasks, will it trigger the request to YARN 
while the job is already running?

  2.  What are the evaluation criteria to trigger a scale-up? Is it possible to 
manually trigger a scale-up for testing purposes?

Thanks
