zhongwei liu created SPARK-31173:
------------------------------------

             Summary: Spark Kubernetes add tolerations and nodeName support
                 Key: SPARK-31173
                 URL: https://issues.apache.org/jira/browse/SPARK-31173
             Project: Spark
          Issue Type: New Feature
          Components: Kubernetes
    Affects Versions: 3.1.0, 2.4.6
         Environment: Alibaba Cloud ACK with spark operator(v1beta2-1.1.0-2.4.5) and spark(2.4.5)
            Reporter: zhongwei liu

When you run Spark on a serverless Kubernetes cluster (virtual-kubelet), you need to specify nodeSelectors, tolerations, and even nodeName to get better scheduling performance. Currently Spark doesn't support tolerations. To use this feature, you must use an admission controller webhook to decorate the pod, but the performance is extremely bad. Here is the benchmark:

With webhook
    Batch size: 500
    Pod creation: about 7 Pods/s
    All Pods running: 5 min

Without webhook
    Batch size: 500
    Pod creation: more than 500 Pods/s
    All Pods running: 45 s

Adding tolerations and nodeName support to Spark would be a great help when running large-scale jobs on a serverless Kubernetes cluster.
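
For illustration, here is a minimal sketch of the scheduling fields in question, built as a plain Python dictionary and printed as JSON. The field names (nodeName, nodeSelector, tolerations) are the standard Kubernetes Pod spec fields. The helper name, the node name, the taint key and value, and the selector label are all assumptions chosen for the example and are not taken from this issue or from Spark.

```python
import json


def virtual_node_scheduling(node_name, taint_key, taint_value, selector=None):
    """Build the scheduling-related part of a Kubernetes Pod spec.

    Hypothetical helper for illustration only; it is not part of Spark.
    """
    spec = {
        # Pins the pod to one node directly, so the scheduler does not pick a node.
        "nodeName": node_name,
        # Lets the pod tolerate a NoSchedule taint with the given key and value.
        "tolerations": [
            {
                "key": taint_key,
                "operator": "Equal",
                "value": taint_value,
                "effect": "NoSchedule",
            }
        ],
    }
    if selector:
        # Optional label constraint on the target node.
        spec["nodeSelector"] = dict(selector)
    return spec


# All values below are assumed placeholders, not taken from the issue.
fragment = virtual_node_scheduling(
    node_name="virtual-kubelet",
    taint_key="virtual-kubelet.io/provider",
    taint_value="alibabacloud",
    selector={"type": "virtual-kubelet"},
)
print(json.dumps({"spec": fragment}, indent=2))
```

With the webhook approach described above, these fields are presumably patched into each pod at admission time; the request is for Spark to set them itself when it creates the pods.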