Hi, I've tinkered around a bit more and found that the problem is actually the difference between Native mode and Standalone mode. In Standalone mode, the pod definition doesn't get a resource request for the GPU (nvidia.com/gpu), whereas in Native mode it does. I'll open another question since this isn't related to the autoscaler at all. Thanks.
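In case it helps anyone compare the two modes on their own cluster, here is a rough sketch of the check in Python. It assumes the pod manifest has already been loaded into a dict (for example from the JSON printed by `kubectl get pod <name> -o json`) and that the GPU is exposed under the `nvidia.com/gpu` resource name. The function name `gpu_settings` and the two sample manifests are made up for illustration and are not taken from my actual deployment.

```python
from typing import Any, Dict

# Assumption: the resource name under which the GPU is exposed to Kubernetes.
GPU_RESOURCE = "nvidia.com/gpu"


def gpu_settings(pod: Dict[str, Any], resource: str = GPU_RESOURCE) -> Dict[str, Dict[str, Any]]:
    """Return the GPU request/limit of each container that declares one."""
    found: Dict[str, Dict[str, Any]] = {}
    for container in pod.get("spec", {}).get("containers", []):
        resources = container.get("resources") or {}
        # Look up the GPU entry under both "requests" and "limits".
        entry = {
            kind: (resources.get(kind) or {}).get(resource)
            for kind in ("requests", "limits")
        }
        # Only report containers that mention the GPU at all.
        if any(value is not None for value in entry.values()):
            found[container.get("name", "<unnamed>")] = entry
    return found


# Hypothetical manifests, trimmed down to the fields that matter here.
standalone_pod = {
    "spec": {
        "containers": [
            {"name": "flink-main-container", "resources": {"limits": {"cpu": "1"}}}
        ]
    }
}
native_pod = {
    "spec": {
        "containers": [
            {
                "name": "flink-main-container",
                "resources": {
                    "requests": {"nvidia.com/gpu": "1"},
                    "limits": {"nvidia.com/gpu": "1"},
                },
            }
        ]
    }
}

print(gpu_settings(standalone_pod))  # no GPU entry in this sample
print(gpu_settings(native_pod))      # GPU request and limit are reported
```

An empty result for a TaskManager pod would match what I saw in Standalone mode, while a non-empty one would match Native mode.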
Regards,
Sunny

On Tue, Aug 1, 2023 at 3:34 PM Gyula Fóra <gyula.f...@gmail.com> wrote:

> The autoscaler only works for FlinkDeployments in Native mode. You should
> turn off the reactive scheduler mode as well because that's something
> completely different.
> After that you can check the autoscaler logs for more info.
>
> Gyula
>
> On Tue, Aug 1, 2023 at 10:33 AM Raihan Sunny via user <
> user@flink.apache.org> wrote:
>
>> Hi,
>>
>> I have a workload that depends on the GPU. I have only 1 GPU card. As per
>> the documentation I have added the necessary configurations and can run the
>> GPU workload in standalone REACTIVE mode with as many taskmanager instances
>> as required.
>>
>> I have set the number of task slots to 1 so that an increase in parallelism
>> causes a new pod to be created. I can scale up the job just fine in this
>> mode; however, when I add autoscaling configurations to the FlinkDeployment
>> manifest, scaling up doesn't work. This is because, with the autoscaling
>> manifest, resource requests and limits for the GPU seem to be set
>> automatically on the pods. This is not the case with the standalone mode,
>> which is why I guess scaling up doesn't cause any issues there.
>>
>> So, what can I do to get the autoscaler working? I'm using Flink version
>> 1.17.1 with PyFlink and Flink Kubernetes Operator version 1.5.0.
>>
>>
>> Regards,
>> Sunny
>>
>> [image: SELISE]
>>
>> SELISE Group
>> Zürich: The Circle 37, 8058 Zürich-Airport, Switzerland
>> Munich: Tal 44, 80331 München, Germany
>> Dubai: Building 3, 3rd Floor, Dubai Design District, Dubai, United Arab
>> Emirates
>> Dhaka: Midas Center, Road 16, Dhanmondi, Dhaka 1209, Bangladesh
>> Thimphu: Bhutan Innovation Tech Center, Babesa, P.O. Box 633, Thimphu,
>> Bhutan
>>
>> Visit us: www.selisegroup.com
>>
>> *Important Note: This e-mail and any attachment are confidential and may
>> contain trade secrets and may well also be legally privileged or otherwise
>> protected from disclosure. If you have received it in error, you are on
>> notice of its status. Please notify us immediately by reply e-mail and then
>> delete this e-mail and any attachment from your system. If you are not the
>> intended recipient please understand that you must not copy this e-mail or
>> any attachment or disclose the contents to any other person. Thank you for
>> your cooperation.*
>>