Hi,
Have you set the Python environment variables PYSPARK_PYTHON and
PYSPARK_DRIVER_PYTHON correctly?
You can print the environment variables within your PySpark script to
verify this:
import os
print("PYTHONPATH:", os.environ.get("PYTHONPATH"))
print("PYSPARK_PYTHON:", os.environ.get("PYSPARK_PYTHON"))
print("PYSPARK_DRIVER_PYTHON:", os.environ.get("PYSPARK_DRIVER_PYTHON"))
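If either variable turns out to be unset or pointing at the wrong binary, one option is to pin the worker interpreter at the top of the driver script, before the SparkContext/SparkSession is created. A minimal sketch (the path is an example; substitute the python3.7 path on your cluster):

```python
import os

# Pin the Python that Spark launches for workers. This must run before the
# SparkContext is created, or it has no effect. (Example path; adjust it.)
os.environ["PYSPARK_PYTHON"] = "/usr/bin/python3.7"

# Note: the driver's own interpreter cannot be changed from inside an
# already-running script; export PYSPARK_DRIVER_PYTHON in the shell
# before calling spark-submit instead.
print("PYSPARK_PYTHON:", os.environ["PYSPARK_PYTHON"])
```

Setting it here rather than in the shell can help when jobs are launched by a scheduler whose environment you don't fully control.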
That did not paste well, let me try again
I am using python3.7 and spark 2.4.7
I am trying to figure out why my job is using the wrong python version
This is how it is starting up: the logs confirm that I am using Python 3.7,
but I later see an error message showing it is trying to use 3.8, and I am
not sure where it is picking that up.
SP
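One way to pin down where the 3.8 interpreter comes from is to print the interpreter details from inside the job itself, on both the driver and (if possible) inside a task, and compare the two sides. A minimal sketch to drop into the script:

```python
import os
import sys

# Show exactly which interpreter this process is running under
print("executable:", sys.executable)
print("version:", sys.version.split()[0])  # e.g. 3.7.x vs 3.8.x

# And which interpreter Spark was told to use (None means unset,
# in which case Spark falls back to whatever "python" resolves to)
print("PYSPARK_PYTHON:", os.environ.get("PYSPARK_PYTHON"))
print("PYSPARK_DRIVER_PYTHON:", os.environ.get("PYSPARK_DRIVER_PYTHON"))
```

If the driver reports 3.7 but a task reports 3.8, the mismatch is on the executor side, which usually points at the worker nodes' default `python` rather than the submitting machine.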
Hello Mich,
Thank you for your questions. Here are my responses:
> 1. What investigation have you done to show that it is running in local
mode?
I have verified through the History Server's Environment tab that:
- "spark.master" is set to local[*]
- "spark.app.id" begins with local-xxx
- "spark.s
Personally, I have not used this feature myself. However, some points:
1. What investigation have you done to show that it is running in local
mode?
2. Who configured this Kubernetes cluster? Is it supplied by a cloud
vendor?
3. Confirm that you have configured Spark Connect Serv