Tim Hughes created SPARK-34349:
----------------------------------

             Summary: No python3 in docker images
                 Key: SPARK-34349
                 URL: https://issues.apache.org/jira/browse/SPARK-34349
             Project: Spark
          Issue Type: Bug
          Components: Kubernetes
    Affects Versions: 3.0.1
            Reporter: Tim Hughes

The spark-py container image doesn't receive the instruction to use python3 and defaults to Python 2.7.

The worker container was built using the following commands:

{code:java}
mkdir ./tmp
wget -qO- https://www.mirrorservice.org/sites/ftp.apache.org/spark/spark-3.0.1/spark-3.0.1-bin-hadoop3.2.tgz | tar -C ./tmp/ -xzf -
cd ./tmp/spark-3.0.1-bin-hadoop3.2/
./bin/docker-image-tool.sh -r docker.io/timhughes -t spark-3.0.1-bin-hadoop3.2 -p kubernetes/dockerfiles/spark/bindings/python/Dockerfile build
docker push docker.io/timhughes/spark-py:spark-3.0.1-bin-hadoop3.2
{code}

This is the code I am using to initialize the workers:

{code:java}
import os
from pyspark import SparkContext, SparkConf
from pyspark.sql import SparkSession

# Create Spark config for our Kubernetes based cluster manager
sparkConf = SparkConf()
sparkConf.setMaster("k8s://https://kubernetes.default.svc.cluster.local:443")
sparkConf.setAppName("spark")
sparkConf.set("spark.kubernetes.container.image", "docker.io/timhughes/spark-py:spark-3.0.1-bin-hadoop3.2")
sparkConf.set("spark.kubernetes.namespace", "spark")
sparkConf.set("spark.executor.instances", "2")
sparkConf.set("spark.executor.cores", "1")
sparkConf.set("spark.driver.memory", "1024m")
sparkConf.set("spark.executor.memory", "1024m")
sparkConf.set("spark.kubernetes.pyspark.pythonVersion", "3")
sparkConf.set("spark.kubernetes.authenticate.driver.serviceAccountName", "spark")
sparkConf.set("spark.kubernetes.authenticate.serviceAccountName", "spark")
sparkConf.set("spark.driver.port", "29413")
sparkConf.set("spark.driver.host", "my-notebook-deployment.spark.svc.cluster.local")

# Initialize our Spark cluster, this will actually
# generate the worker nodes.
spark = SparkSession.builder.config(conf=sparkConf).getOrCreate()
sc = spark.sparkContext
{code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
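
One way to confirm which interpreter the executors actually run is to ask them directly. The following is only a minimal sketch: the helper names `interpreter_info` and `executor_python_versions` are hypothetical and not part of Spark, and `sc` is assumed to be the SparkContext created by the code in the report above. If the executor's Python differs from the driver's in major or minor version, PySpark will likely raise its own version-mismatch error instead of returning a result, which points to the same problem.

```python
import sys


def interpreter_info(_=None):
    """Return (major, minor, executable) for the interpreter running this call."""
    return (sys.version_info.major, sys.version_info.minor, sys.executable)


def executor_python_versions(sc, partitions=2):
    """Run interpreter_info on the executors and collect the distinct results.

    `sc` is assumed to be an existing pyspark.SparkContext; `partitions`
    is an illustrative default matching the two executors requested above.
    """
    rdd = sc.parallelize(range(partitions), partitions)
    return rdd.map(interpreter_info).distinct().collect()
```

Calling `executor_python_versions(sc)` from the notebook and comparing the result with `interpreter_info()` on the driver should show whether the executors picked up python3.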