Hi,
Just FYI: running spark-pipelines on the latest master fails with the
following TypeError:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jacek/oss/spark/python/pyspark/__init__.py", line 129, in <module>
    from pyspark.sql import SQLContext, HiveContext # noqa: F401
  File "/Users/jacek/oss/spark/python/pyspark/sql/__init__.py", line 42, in <module>
    from pyspark.sql.types import Row, VariantVal
  File "/Users/jacek/oss/spark/python/pyspark/sql/types.py", line 531, in <module>
    class GeographyType(SpatialType):
  File "/Users/jacek/oss/spark/python/pyspark/sql/types.py", line 547, in GeographyType
    def __init__(self, srid: int | str):
TypeError: unsupported operand type(s) for |: 'type' and 'type'
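
For context: the `int | str` annotation in the last frame is PEP 604 union syntax. Annotations are evaluated when the function is defined (unless the module uses `from __future__ import annotations`), and `|` between plain types only works on Python 3.10 and later. The sketch below reproduces the check without importing PySpark; the helper name `supports_pep604_unions` is made up for illustration and is not part of PySpark.

```python
import sys


def supports_pep604_unions() -> bool:
    """Return True if `int | str` can be evaluated by this interpreter."""
    try:
        # Same expression as the annotation on GeographyType.__init__.
        int | str
    except TypeError:
        # Python < 3.10: type objects do not implement the | operator.
        return False
    return True


if __name__ == "__main__":
    # Print the running interpreter's version next to the check result.
    print(sys.version_info[:3], supports_pep604_unions())
```

Running this with the interpreter that spark-pipelines ends up using should show whether that interpreter is too old for the annotation.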
My Python project declares `requires-python = ">=3.12"`.
This happens when I source the virtual env and execute spark-pipelines
shell.
PySpark requires Python 3.10+ (python_requires=">=3.10"
in python/packaging/client/setup.py).
Could the problem be the following lines in spark-pipelines?
    # Default to standard python3 interpreter unless told otherwise
    if [[ -z "$PYSPARK_PYTHON" ]]; then
      PYSPARK_PYTHON=python3
    fi
On macOS, python3 resolves to the system interpreter, which is 3.9:
❯ type python3
python3 is /usr/bin/python3
❯ python3 --version
Python 3.9.6
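
To check which interpreter that default would pick on a given machine, here is a rough sketch that mimics the quoted shell logic in Python. It assumes the script does nothing else to choose the interpreter, which I have not verified; `resolve_pyspark_python` and `interpreter_version` are hypothetical helper names, not Spark APIs.

```python
import os
import shutil
import subprocess


def resolve_pyspark_python(env=None):
    """Mimic the quoted default: use PYSPARK_PYTHON if set, else python3."""
    env = os.environ if env is None else env
    # An empty value counts as unset, matching the `-z` test in the script.
    return env.get("PYSPARK_PYTHON") or "python3"


def interpreter_version(executable):
    """Return (major, minor) of the given interpreter, or None if not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "-c", "import sys; print('%d %d' % sys.version_info[:2])"],
        capture_output=True,
        text=True,
        check=True,
    )
    major, minor = result.stdout.split()
    return int(major), int(minor)


if __name__ == "__main__":
    exe = resolve_pyspark_python()
    # Show the command name, where it resolves on PATH, and its version.
    print(exe, shutil.which(exe), interpreter_version(exe))
```

If this reports a version below 3.10, exporting PYSPARK_PYTHON to the virtual env's interpreter before running spark-pipelines should, going by the quoted snippet, bypass the python3 default.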
Should I file an issue (against master)?
BTW, I think GeographyType should be listed in __all__
in pyspark/sql/types.py, shouldn't it?
Regards,
Jacek Laskowski