[ https://issues.apache.org/jira/browse/SPARK-33737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stavros Kontopoulos updated SPARK-33737:
----------------------------------------

Description:

The Kubernetes backend uses the Fabric8 client and a [watch|https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsWatchSnapshotSource.scala#L42] to monitor the K8s API server for pod changes. Every watcher keeps a websocket connection open, and there is no caching mechanism on that path. Caching at the Spark K8s resource manager exists in other areas where we hit the API server for pod CRUD ops, e.g. [here|https://github.com/apache/spark/blob/b8ccd755244d3cd8a81a9f4a1eafa2a4e48759d2/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala#L49].

In an environment where many such connections are kept open by large-scale jobs, this can be problematic and impose significant load on the API server. Many long-running jobs, e.g. streaming jobs, should not produce enough pod changes to justify a continuous watching mechanism.

Recent Fabric8 client versions implement a SharedInformer API plus a Lister; an example can be found [here|https://github.com/fabric8io/kubernetes-client/blob/master/kubernetes-examples/src/main/java/io/fabric8/kubernetes/examples/InformerExample.java#L37]. This new API follows the implementation of the official Java K8s client and its Go counterpart, and it is backed by a caching mechanism that is re-synced after a configurable period to avoid hitting the API server all the time. There is also a lister that keeps track of the current status of resources. Using such a mechanism is commonplace when implementing a K8s controller.

The suggestion is to update the client to v4.13.0 (which contains all the relevant updates to that API) and use the informer+lister API where applicable. I think the lister could also replace part of the snapshotting/notification mechanism.

/cc [~dongjoon] [~eje] [~holden] WDYT?
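For context, the watch in ExecutorPodsWatchSnapshotSource boils down to something like the sketch below. This is a minimal standalone Scala example against the Fabric8 4.x API, not Spark's actual code; the namespace and the label selector values are illustrative assumptions:

{code:scala}
import io.fabric8.kubernetes.api.model.Pod
import io.fabric8.kubernetes.client.{DefaultKubernetesClient, KubernetesClientException, Watcher}

object WatchSketch {
  def main(args: Array[String]): Unit = {
    val client = new DefaultKubernetesClient() // auto-configures from kubeconfig / in-cluster env

    // Each watch holds a websocket open against the API server for its whole
    // lifetime, and every event goes straight to the handler; nothing is cached locally.
    val watch = client.pods()
      .inNamespace("default")              // assumption: namespace
      .withLabel("spark-role", "executor") // assumption: label selector
      .watch(new Watcher[Pod] {
        override def eventReceived(action: Watcher.Action, pod: Pod): Unit =
          println(s"$action ${pod.getMetadata.getName}")
        override def onClose(cause: KubernetesClientException): Unit =
          if (cause != null) println(s"watch closed abnormally: ${cause.getMessage}")
      })

    // ... on shutdown
    watch.close()
    client.close()
  }
}
{code}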
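And a minimal sketch of what the informer+lister alternative could look like with client v4.13.0, adapted from the linked InformerExample (the 30s resync period, the namespace, and the println handlers are illustrative assumptions, not a proposed design):

{code:scala}
import io.fabric8.kubernetes.api.model.{Pod, PodList}
import io.fabric8.kubernetes.client.DefaultKubernetesClient
import io.fabric8.kubernetes.client.informers.ResourceEventHandler
import io.fabric8.kubernetes.client.informers.cache.Lister

import scala.collection.JavaConverters._

object InformerSketch {
  def main(args: Array[String]): Unit = {
    val client = new DefaultKubernetesClient()
    val factory = client.informers()

    // The informer maintains a local, indexed cache of pods and re-syncs it
    // periodically (30s here, an assumption) instead of pushing every event
    // to each consumer over its own raw watch.
    val podInformer = factory.sharedIndexInformerFor(classOf[Pod], classOf[PodList], 30 * 1000L)

    podInformer.addEventHandler(new ResourceEventHandler[Pod] {
      override def onAdd(pod: Pod): Unit =
        println(s"ADD ${pod.getMetadata.getName}")
      override def onUpdate(oldPod: Pod, newPod: Pod): Unit =
        println(s"UPDATE ${newPod.getMetadata.getName}")
      override def onDelete(pod: Pod, deletedFinalStateUnknown: Boolean): Unit =
        println(s"DELETE ${pod.getMetadata.getName}")
    })

    factory.startAllRegisteredInformers()
    Thread.sleep(10000) // give the cache time to sync before reading from it

    // The lister answers "what pods exist right now" from the in-memory cache,
    // so each lookup avoids a round trip to the API server.
    val podLister = new Lister[Pod](podInformer.getIndexer, "default") // assumption: namespace
    podLister.list().asScala.foreach(p => println(p.getMetadata.getName))
  }
}
{code}

In Spark's case the event handler would presumably feed the existing snapshot store, while the lister could serve point-in-time reads that today require either the watch stream or a polling hit on the API server.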
> Use an Informer+Lister API in the ExecutorPodWatcher
> ----------------------------------------------------
>
> Key: SPARK-33737
> URL: https://issues.apache.org/jira/browse/SPARK-33737
> Project: Spark
> Issue Type: Improvement
> Components: Kubernetes
> Affects Versions: 3.0.2
> Reporter: Stavros Kontopoulos
> Priority: Major