Hi, I have a workload that depends on the GPU. I have only 1 GPU card. As per the documentation I have added the necessary configurations and can run the GPU workload in standalone REACTIVE mode with as many taskmanager instances as required.
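For reference, the setup described here corresponds roughly to the options below, written as a Python dict purely for illustration. This is a sketch, not the exact manifest: `scheduler-mode` and `taskmanager.numberOfTaskSlots` are standard Flink options, the two `external-resource*` entries are an assumption about what "the necessary configurations" for the GPU look like, and `taskmanagers_needed` is a hypothetical helper showing why one slot per TaskManager means one pod per unit of parallelism.

```python
# Illustrative sketch only; values are assumptions, not the actual manifest.
flink_configuration = {
    "scheduler-mode": "reactive",          # standalone reactive mode
    "taskmanager.numberOfTaskSlots": "1",  # one slot per TaskManager pod
    # Assumed GPU settings via Flink's external resource framework:
    "external-resources": "gpu",
    "external-resource.gpu.amount": "1",
}


def taskmanagers_needed(parallelism: int, conf: dict) -> int:
    """Hypothetical helper: pods needed to host `parallelism` slots."""
    slots_per_tm = int(conf["taskmanager.numberOfTaskSlots"])
    # Ceiling division: every started TaskManager offers `slots_per_tm` slots.
    return -(-parallelism // slots_per_tm)


if __name__ == "__main__":
    for p in (1, 2, 4):
        print(p, "->", taskmanagers_needed(p, flink_configuration))
```

With one slot per TaskManager, each step up in parallelism needs one more pod, so whether that extra pod can be scheduled on a single-GPU node decides whether scale-up succeeds.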
I have set the number of task slots to 1 so that an increase in parallelism causes a new pod to be created. I can scale up the job just fine in this mode. However, when I add the autoscaling configuration to the FlinkDeployment manifest, scaling up doesn't work. This seems to be because, with the autoscaling manifest, resource requests and limits for the GPU are automatically set on the pods. That is not the case in standalone mode, which I guess is why scaling up doesn't cause any issues there. So, what can I do to get the autoscaler working? I'm using Flink version 1.17.1 with PyFlink and Flink Kubernetes Operator version 1.5.0.

Regards,
Sunny
SELISE Group