[ https://issues.apache.org/jira/browse/SPARK-34349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280382#comment-17280382 ]
Hyukjin Kwon commented on SPARK-34349:
--------------------------------------

I think it's fixed in the upstream. It would be great if you could have a chance to test the Spark 3.1.1 RC and see if it's fixed.

> No python3 in docker images
> ----------------------------
>
>                 Key: SPARK-34349
>                 URL: https://issues.apache.org/jira/browse/SPARK-34349
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 3.0.1
>            Reporter: Tim Hughes
>            Priority: Critical
>
> The spark-py container image doesn't receive the instruction to use python3
> and defaults to Python 2.7.
>
> The worker container was built using the following commands (note: tar extracts
> into ./tmp/, so the cd must descend into it):
> {code:java}
> mkdir ./tmp
> wget -qO- https://www.mirrorservice.org/sites/ftp.apache.org/spark/spark-3.0.1/spark-3.0.1-bin-hadoop3.2.tgz | tar -C ./tmp/ -xzf -
> cd ./tmp/spark-3.0.1-bin-hadoop3.2/
> ./bin/docker-image-tool.sh -r docker.io/timhughes -t spark-3.0.1-bin-hadoop3.2 -p kubernetes/dockerfiles/spark/bindings/python/Dockerfile build
> docker push docker.io/timhughes/spark-py:spark-3.0.1-bin-hadoop3.2{code}
>
> This is the code I am using to initialize the workers:
>
> {code:java}
> import os
> from pyspark import SparkContext, SparkConf
> from pyspark.sql import SparkSession
>
> # Create Spark config for our Kubernetes-based cluster manager
> sparkConf = SparkConf()
> sparkConf.setMaster("k8s://https://kubernetes.default.svc.cluster.local:443")
> sparkConf.setAppName("spark")
> sparkConf.set("spark.kubernetes.container.image", "docker.io/timhughes/spark-py:spark-3.0.1-bin-hadoop3.2")
> sparkConf.set("spark.kubernetes.namespace", "spark")
> sparkConf.set("spark.executor.instances", "2")
> sparkConf.set("spark.executor.cores", "1")
> sparkConf.set("spark.driver.memory", "1024m")
> sparkConf.set("spark.executor.memory", "1024m")
> sparkConf.set("spark.kubernetes.pyspark.pythonVersion", "3")
> sparkConf.set("spark.kubernetes.authenticate.driver.serviceAccountName", "spark")
> sparkConf.set("spark.kubernetes.authenticate.serviceAccountName", "spark")
> sparkConf.set("spark.driver.port", "29413")
> sparkConf.set("spark.driver.host", "my-notebook-deployment.spark.svc.cluster.local")
>
> # Initialize our Spark cluster; this will actually
> # generate the worker nodes.
> spark = SparkSession.builder.config(conf=sparkConf).getOrCreate()
> sc = spark.sparkContext
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
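[Editor's note] Besides `spark.kubernetes.pyspark.pythonVersion`, Spark also honours the `PYSPARK_PYTHON` / `PYSPARK_DRIVER_PYTHON` environment variables when launching Python workers. A minimal sketch of the usual workaround, assuming it is run in the driver process before the SparkSession above is created (the interpreter paths here are assumptions, not taken from the thread):

```python
import os
import sys

# Force the executor-side workers onto python3 (the image's PATH is assumed
# to contain a python3 binary) and pin the driver to the interpreter that is
# currently running this script.
os.environ["PYSPARK_PYTHON"] = "python3"
os.environ["PYSPARK_DRIVER_PYTHON"] = sys.executable

# After the session exists, the executor interpreter can be confirmed with
# something like:
#   spark.sparkContext.parallelize([0], 1) \
#       .map(lambda _: __import__("sys").version).collect()
print(os.environ["PYSPARK_PYTHON"])  # prints: python3
```

If the executors still report Python 2.7 after this, the Python binding layer of the image itself (the `bindings/python/Dockerfile` used above) is the next thing to inspect.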