Sebastian Eckweiler created SEDONA-59:
-----------------------------------------
             Summary: Remove explicit pyspark dependency
                 Key: SEDONA-59
                 URL: https://issues.apache.org/jira/browse/SEDONA-59
             Project: Apache Sedona
          Issue Type: Improvement
            Reporter: Sebastian Eckweiler

The currently published Sedona Python package has an explicit dependency on pyspark. On Spark platforms such as Databricks, Spark comes pre-installed but is not integrated with pip. A `pip install sedona` will therefore install another copy of pyspark, which in the best case is merely superfluous. In the worst case it may cause trouble in combination with the pre-installed Spark.

Workarounds, such as installing Sedona without its dependencies, can work for a while. But this is fragile: as soon as dependency validation (as performed, e.g., by setuptools entry points) comes into play, it will break.

I see two options:
* Remove the pyspark dependency completely, considering it "obvious".
* Add pyspark as an optional `extras_require` entry under an extra called "spark". This would allow a pip install as below, which would fetch Sedona together with the corresponding pyspark distribution:
{code:java}
pip install sedona[spark]{code}

I'd be willing to create a corresponding pull request if one of these options is accepted.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
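For illustration, the second option could be declared roughly as follows in setup.py. This is only a sketch of the proposed change, not the actual Sedona build configuration; the package name, version bound, and other arguments are illustrative assumptions.

```python
# Hypothetical setup.py fragment: pyspark moved from install_requires
# to an optional "spark" extra (names and version bound are assumptions).
extras_require = {
    # `pip install sedona[spark]` would also install pyspark;
    # a plain `pip install sedona` would leave any pre-installed
    # Spark (e.g. on Databricks) untouched.
    "spark": ["pyspark>=2.3.0"],
}

# Inside setup.py this dict would be passed to setuptools.setup(), e.g.:
#   setup(name="sedona", ..., extras_require=extras_require)
```

With this in place, users on managed Spark platforms install the bare package, while local users opt in to the bundled pyspark via the `[spark]` extra.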