[ 
https://issues.apache.org/jira/browse/SPARK-33737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stavros Kontopoulos updated SPARK-33737:
----------------------------------------
    Description: 
The Kubernetes backend uses the Fabric8 client and a 
[watch|https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsWatchSnapshotSource.scala#L42]
 to monitor the K8s API server for pod changes. Every watcher keeps a websocket 
connection open and has no caching mechanism on that path. Caching already exists 
in other areas of the Spark K8s resource manager where we hit the API server 
for pod CRUD ops, e.g. 
[here|https://github.com/apache/spark/blob/b8ccd755244d3cd8a81a9f4a1eafa2a4e48759d2/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala#L49].
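For context, a condensed sketch of what that watch source does today is shown below 
(Fabric8 4.x watch API; the label keys/values follow Spark's K8s constants, while the 
helper name and callback are made up for illustration):
{code:scala}
import io.fabric8.kubernetes.api.model.Pod
import io.fabric8.kubernetes.client.{KubernetesClient, KubernetesClientException, Watch, Watcher}

object WatchSketch {
  // Hypothetical helper mirroring ExecutorPodsWatchSnapshotSource#start:
  // every event arrives over a long-lived websocket, and there is no local cache.
  def startExecutorPodsWatch(
      client: KubernetesClient,
      applicationId: String)(onPodEvent: Pod => Unit): Watch =
    client.pods()
      .withLabel("spark-app-selector", applicationId) // SPARK_APP_ID_LABEL
      .withLabel("spark-role", "executor")            // SPARK_ROLE_LABEL / executor role
      .watch(new Watcher[Pod] {
        override def eventReceived(action: Watcher.Action, pod: Pod): Unit =
          onPodEvent(pod)
        override def onClose(e: KubernetesClientException): Unit = ()
      })
}
{code}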


In an environment where many connections are kept open due to large-scale jobs, 
this can be problematic and impose significant load on the API server. Moreover, 
many long-running jobs, e.g. streaming jobs, produce too few pod changes to 
justify a continuous watching mechanism.

Recent Fabric8 client versions implement a SharedInformer API plus a Lister; 
an example can be found 
[here|https://github.com/fabric8io/kubernetes-client/blob/master/kubernetes-examples/src/main/java/io/fabric8/kubernetes/examples/InformerExample.java#L37].
This new API follows the implementation of the official Java K8s client and its 
Go counterpart. It is backed by a caching mechanism that is re-synced after a 
configurable period, so the API server is not hit all the time, and a Lister 
keeps track of the current state of resources. Using such a mechanism is 
commonplace when implementing a K8s controller.
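A minimal sketch of the informer+lister usage against the Fabric8 4.13.x API, 
adapted from the linked InformerExample (the namespace and resync period are 
placeholders):
{code:scala}
import io.fabric8.kubernetes.api.model.{Pod, PodList}
import io.fabric8.kubernetes.client.DefaultKubernetesClient
import io.fabric8.kubernetes.client.informers.ResourceEventHandler
import io.fabric8.kubernetes.client.informers.cache.Lister

object PodInformerSketch extends App {
  val client = new DefaultKubernetesClient()
  val factory = client.informers()

  // The informer keeps a local cache of pods and re-syncs it against the
  // API server every 30s, instead of relying solely on a websocket watch.
  val podInformer =
    factory.sharedIndexInformerFor(classOf[Pod], classOf[PodList], 30 * 1000L)

  podInformer.addEventHandler(new ResourceEventHandler[Pod] {
    override def onAdd(pod: Pod): Unit =
      println(s"ADDED ${pod.getMetadata.getName}")
    override def onUpdate(oldPod: Pod, newPod: Pod): Unit =
      println(s"UPDATED ${newPod.getMetadata.getName}")
    override def onDelete(pod: Pod, deletedFinalStateUnknown: Boolean): Unit =
      println(s"DELETED ${pod.getMetadata.getName}")
  })

  factory.startAllRegisteredInformers()

  // The Lister answers list/get calls from the informer's in-memory cache,
  // so reads here do not hit the API server.
  val podLister = new Lister[Pod](podInformer.getIndexer, "default")
  val cachedPods = podLister.list()
}
{code}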
The suggestion is to update the client to v4.13.0 (which includes all the 
relevant updates to that API) and use the informer+lister API where applicable.
I think the Lister could also replace part of the snapshotting/notification 
mechanism.
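To make that last point concrete, here is a hypothetical sketch of how the 
polling snapshot source could read from the informer cache instead of listing 
pods from the API server (the class is made up, and the store trait/method names 
are assumed from the current Spark sources; this is not a definitive design):
{code:scala}
import scala.collection.JavaConverters._
import io.fabric8.kubernetes.api.model.Pod
import io.fabric8.kubernetes.client.informers.cache.Lister
// Assumed Spark-internal store, see ExecutorPodsSnapshotsStore in the K8s backend.
import org.apache.spark.scheduler.cluster.k8s.ExecutorPodsSnapshotsStore

// Hypothetical replacement for the API-server list in
// ExecutorPodsPollingSnapshotSource: each poll is served from the
// informer's local cache, so no request reaches the API server here.
class CachedPollingSnapshotSource(
    snapshotsStore: ExecutorPodsSnapshotsStore,
    podLister: Lister[Pod]) {

  def pollOnce(): Unit = {
    val executorPods = podLister.list().asScala
    snapshotsStore.replaceSnapshot(executorPods)
  }
}
{code}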

/cc [~dongjoon] [~eje] [~holden] WDYT?
 


> Use an Informer+Lister API in the ExecutorPodWatcher
> ----------------------------------------------------
>
>                 Key: SPARK-33737
>                 URL: https://issues.apache.org/jira/browse/SPARK-33737
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes
>    Affects Versions: 3.0.2
>            Reporter: Stavros Kontopoulos
>            Priority: Major
>


