Stavros Kontopoulos created SPARK-33737: -------------------------------------------
Summary: Use an Informer+Lister API in the ExecutorPodWatcher
Key: SPARK-33737
URL: https://issues.apache.org/jira/browse/SPARK-33737
Project: Spark
Issue Type: Improvement
Components: Kubernetes
Affects Versions: 3.0.2
Reporter: Stavros Kontopoulos

The Kubernetes backend uses the Fabric8 client and a [watch|https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsWatchSnapshotSource.scala#L42] to monitor the K8s API server for pod changes. Every watcher keeps a websocket connection open and has no caching mechanism at that layer. Caching does exist in other areas where the API server is hit for pod CRUD operations, e.g. [here|https://github.com/apache/spark/blob/b8ccd755244d3cd8a81a9f4a1eafa2a4e48759d2/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala#L49]. In an environment where many connections are kept open due to large-scale jobs, this could be problematic. Moreover, many long-running jobs (e.g. streaming jobs) should not generate enough pod changes to justify a continuous watching mechanism. Recent Fabric8 client versions implement a SharedInformer+Lister API; an example can be found [here|https://github.com/fabric8io/kubernetes-client/blob/master/kubernetes-examples/src/main/java/io/fabric8/kubernetes/examples/InformerExample.java#L37]. This new API follows the implementation of the official Java K8s client and its Go counterpart, and it is backed by a caching mechanism that is resynced after a configurable period. There is also a Lister that keeps track of the current status of resources. The suggestion is to upgrade the client to v4.13.0 (which has all the relevant updates for that API) and use the Informer+Lister API where applicable. I think the Lister could replace part of the snapshotting/notification mechanism.

/cc [~erikerlandson] [~dongjoon] WDYT?
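For illustration, a minimal sketch of what the proposed Informer+Lister usage could look like with the Fabric8 4.x API, modeled on the InformerExample linked above. The resync period, namespace name, and event-handler bodies are hypothetical placeholders, not Spark code; this would also require a reachable cluster (or kubeconfig) to actually run.

```java
import io.fabric8.kubernetes.api.model.Pod;
import io.fabric8.kubernetes.api.model.PodList;
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.informers.ResourceEventHandler;
import io.fabric8.kubernetes.client.informers.SharedIndexInformer;
import io.fabric8.kubernetes.client.informers.SharedInformerFactory;
import io.fabric8.kubernetes.client.informers.cache.Lister;

public class ExecutorPodInformerSketch {
  public static void main(String[] args) {
    try (KubernetesClient client = new DefaultKubernetesClient()) {
      SharedInformerFactory factory = client.informers();

      // Informer over Pods with a 30s cache resync period (hypothetical value).
      SharedIndexInformer<Pod> podInformer =
          factory.sharedIndexInformerFor(Pod.class, PodList.class, 30 * 1000L);

      // Event handlers could feed the existing snapshot/notification mechanism.
      podInformer.addEventHandler(new ResourceEventHandler<Pod>() {
        @Override public void onAdd(Pod pod) { /* record new executor pod */ }
        @Override public void onUpdate(Pod oldPod, Pod newPod) { /* record state change */ }
        @Override public void onDelete(Pod pod, boolean unknownFinalState) { /* record removal */ }
      });

      factory.startAllRegisteredInformers();

      // The Lister serves reads from the informer's local cache,
      // avoiding a round-trip to the API server per query.
      Lister<Pod> podLister = new Lister<>(podInformer.getIndexer(), "spark-namespace");
      podLister.list().forEach(p -> System.out.println(p.getMetadata().getName()));
    }
  }
}
```

Compared with the current watch, reads go through the local cache and a single shared connection does the watching, which is the connection-scaling benefit argued for above.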
--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org