Hi,

Just an FYI: running spark-pipelines on the latest master fails with the
following TypeError:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jacek/oss/spark/python/pyspark/__init__.py", line 129, in <module>
    from pyspark.sql import SQLContext, HiveContext  # noqa: F401
  File "/Users/jacek/oss/spark/python/pyspark/sql/__init__.py", line 42, in <module>
    from pyspark.sql.types import Row, VariantVal
  File "/Users/jacek/oss/spark/python/pyspark/sql/types.py", line 531, in <module>
    class GeographyType(SpatialType):
  File "/Users/jacek/oss/spark/python/pyspark/sql/types.py", line 547, in GeographyType
    def __init__(self, srid: int | str):
TypeError: unsupported operand type(s) for |: 'type' and 'type'
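
For context, a minimal sketch of why that line fails: the `int | str` annotation is evaluated when the `def` statement runs, and `|` between classes is only supported from Python 3.10 on (PEP 604). The helper names below are hypothetical and only for illustration. `Union[int, str]` is shown as a spelling that older interpreters accept, not as what PySpark should necessarily do.

```python
import sys
from typing import Union


def supports_pep604_unions() -> bool:
    """Return True if `int | str` can be evaluated at runtime."""
    try:
        int | str  # raises TypeError on interpreters older than 3.10
    except TypeError:
        return False
    return True


# Hypothetical helper: Union[...] is accepted on 3.9 as well as on 3.10+.
def describe_srid(srid: Union[int, str]) -> str:
    return f"srid={srid!r}"


print(sys.version_info[:2], supports_pep604_unions())
```

Running this with the interpreter that spark-pipelines picks up should show whether it is new enough for the annotation in types.py.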

My Python project defines: `requires-python = ">=3.12"`

This happens when I activate the virtual env and run the spark-pipelines
shell script.

PySpark requires Python 3.10+ (python_requires=">=3.10"
in python/packaging/client/setup.py).

Could the problem be the following lines in spark-pipelines?

# Default to standard python3 interpreter unless told otherwise
if [[ -z "$PYSPARK_PYTHON" ]]; then
  PYSPARK_PYTHON=python3
fi

On macOS, python3 resolves to 3.9:

❯ type python3
python3 is /usr/bin/python3

❯ python3 --version
Python 3.9.6
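
Here is a rough Python sketch of how I read that default, assuming the script does nothing else to pick an interpreter: an unset or empty PYSPARK_PYTHON falls back to `python3`, and the version you get depends on which `python3` comes first on PATH. The function names and MIN_VERSION are hypothetical, made up for this illustration.

```python
import os
import shutil
import subprocess

MIN_VERSION = (3, 10)  # taken from python_requires=">=3.10" quoted above


def resolve_pyspark_python(env=None):
    """Mimic the script's default: unset or empty PYSPARK_PYTHON -> "python3"."""
    env = os.environ if env is None else env
    return env.get("PYSPARK_PYTHON") or "python3"


def interpreter_version(executable):
    """Return (major, minor) of the given interpreter, or None if not found."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "-c", "import sys; print(sys.version_info[0], sys.version_info[1])"],
        capture_output=True,
        text=True,
        check=True,
    )
    major, minor = result.stdout.split()
    return int(major), int(minor)


if __name__ == "__main__":
    chosen = resolve_pyspark_python()
    version = interpreter_version(chosen)
    if version is None:
        print(f"{chosen}: not found on PATH")
    elif version < MIN_VERSION:
        print(f"{chosen}: Python {version[0]}.{version[1]} is older than 3.10")
    else:
        print(f"{chosen}: Python {version[0]}.{version[1]} meets the 3.10 minimum")
```

Exporting PYSPARK_PYTHON to point at the virtual env's interpreter before running spark-pipelines should sidestep the default, if this reading is right.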

Should I file an issue (against master)?

BTW, I think GeographyType should be listed in __all__
in pyspark/sql/types.py, shouldn't it?

Pozdrawiam,
Jacek Laskowski
----
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on Bluesky <https://bsky.app/profile/books.japila.pl>

<https://twitter.com/jaceklaskowski>
