Hi!

It’s currently not possible to run the operator in parallel by simply adding more replicas. However, there are several things you can do to scale both vertically and horizontally.

First of all, you can run multiple operators, each watching a different set of namespaces, to partition the load. The operator also supports watching only CRs that match a certain label selector, which would allow you to partition the load horizontally with custom CR labels if necessary.

You can also try increasing the reconciler parallelism of the operator so that it uses more threads and reconciles more CRs in parallel. If you increase this, you might need to increase the heap size as well.

Let me know if this helps!

Gyula

On Thu, 27 Apr 2023 at 09:15, Talat Uyarer via user <user@flink.apache.org> wrote:

> Hi All,
>
> We are using Flink Kubernetes Operator on our production. We have 3k+ jobs
> in standalone mode. But after 2.5k jobs operator getting slow. Now when we
> submit a job it takes 10+ minutes to the job runs. Does anyone use similar
> scale or more job ?
>
> Now we run as a single pod. Does operator support multi pods if i increase
> replicas ?
>
> Do you have any suggestions where should i start looking to debug ?
>
> Thanks
>
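
P.S. To make the namespace-partitioning suggestion above a bit more concrete, here is a minimal sketch in Python. It only splits a list of namespaces into disjoint groups and prints the settings each operator instance could be given. The namespace names, the number of operators, and the parallelism value are made up for illustration. The option names used are kubernetes.operator.watched.namespaces and kubernetes.operator.reconcile.parallelism, so please check them (and kubernetes.operator.label.selector, if you prefer label-based partitioning) against the configuration reference for your operator version. The heap size is configured separately and is not covered here.

```python
from typing import Dict, List


def partition_namespaces(namespaces: List[str], operators: int) -> List[List[str]]:
    """Spread namespaces round-robin so each operator watches a disjoint set."""
    if operators < 1:
        raise ValueError("operators must be >= 1")
    shards: List[List[str]] = [[] for _ in range(operators)]
    # Sort first so the assignment is stable between runs.
    for i, ns in enumerate(sorted(namespaces)):
        shards[i % operators].append(ns)
    return shards


def operator_config(shard: List[str], parallelism: int) -> Dict[str, str]:
    """Build the settings for one operator instance (values are illustrative)."""
    return {
        "kubernetes.operator.watched.namespaces": ",".join(shard),
        "kubernetes.operator.reconcile.parallelism": str(parallelism),
    }


if __name__ == "__main__":
    # Hypothetical namespaces; replace with the ones that hold your jobs.
    namespaces = ["team-a", "team-b", "team-c", "team-d", "team-e"]
    for idx, shard in enumerate(partition_namespaces(namespaces, operators=2)):
        print(f"# operator-{idx}")
        for key, value in operator_config(shard, parallelism=10).items():
            print(f"{key}: {value}")
```

Each group would go to a separately installed operator, so no two operators reconcile the same CR. Round-robin balances the number of namespaces, not the number of jobs, so if your 3k+ jobs are spread unevenly you may want to weight the split by job count instead.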