set92 opened a new issue, #782:
URL: https://github.com/apache/sedona/issues/782
## Expected behavior
Reading a file that contains invalid geometries should not fail. Alternatively,
there could be a parameter that applies make_valid while reading the file, or a
flag that assigns NULL (or some other value) to rows that raise an error while being read.
For the code shown below, I expected output like this:
```
+-----+------------------------+
|name |geometry |
+-----+------------------------+
|good |POINT (4.66452 52.13046)|
|test1|NULL |
|test2|NULL |
+-----+------------------------+
```
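The requested "NULL on error" behavior can be sketched in plain Python. This is only an illustration, not Sedona's API: `null_on_error` and `strict_point` are hypothetical names, and `strict_point` is a toy stand-in for a strict WKT reader that handles only 2D points.

```python
import re
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


def null_on_error(parse: Callable[[str], T]) -> Callable[[Optional[str]], Optional[T]]:
    """Wrap a strict parser so that a failed parse yields None instead of raising."""
    def wrapper(text: Optional[str]) -> Optional[T]:
        if text is None:
            return None
        try:
            return parse(text)
        except Exception:
            # Any parse failure becomes a NULL-like value.
            return None
    return wrapper


# Hypothetical strict reader: accepts only "POINT(x y)" and raises otherwise.
_POINT = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$",
    re.IGNORECASE,
)


def strict_point(wkt: str) -> Tuple[float, float]:
    match = _POINT.match(wkt)
    if match is None:
        raise ValueError(f"invalid WKT: {wkt!r}")
    return float(match.group(1)), float(match.group(2))


safe_point = null_on_error(strict_point)

rows = [
    ("good", "POINT(4.66452 52.13046)"),
    ("test1", "POINT(4.66452 52.13046"),  # missing ")"
]
parsed = [(name, safe_point(wkt)) for name, wkt in rows]
```

In PySpark, a wrapper of this kind could presumably be registered as a UDF around a real WKT parser, at the cost of Python serialization overhead; a flag on the reader itself would avoid that.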
## Actual behavior
Instead, reading the file fails with the error raised by the last row.
## Steps to reproduce the problem
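The input data in this code contains 2 invalid geometries: the second row is
missing a closing ")", and the third one is a polygon whose ring self-intersects
(its ring is also not closed, since the first and last points differ).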
```python
import findspark

# Locate the local Spark installation before importing pyspark.
findspark.init()

from pyspark.sql import Row, SparkSession
from pyspark.sql.types import StringType, StructField, StructType
from sedona.register import SedonaRegistrator

spark = SparkSession.builder \
    .master('local[1]') \
    .appName('test') \
    .getOrCreate()
# Register Sedona's SQL functions (assumes the Sedona jars are on the classpath).
SedonaRegistrator.registerAll(spark)

rows = [Row(geometry='POINT(4.66452 52.13046)', name='good'),
        # Invalid: missing the closing ")".
        Row(geometry='POINT(4.66452 52.13046', name='test1'),
        # Invalid: self-intersecting, unclosed ring.
        Row(geometry='POLYGON((0 -1, 2 1, 2 -1, 0 1))', name='test2')]
newDf = spark.sparkContext.parallelize(rows)
schema = StructType([StructField('geometry', StringType(), True),
                     StructField('name', StringType(), True)])
dfs2 = spark.createDataFrame(newDf, schema)
dfs2.createOrReplaceTempView("location")

# The query fails here instead of returning NULL for the invalid rows.
spark.sql(
    "SELECT name, st_geomFromWKT(geometry) AS geometry FROM location"
).show(5)
```
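To make the problem with the third row concrete, here is a minimal pure-Python sketch that checks a ring for the two defects it has. The helper names (`orient`, `segments_cross`, `ring_problems`) are made up for this illustration and are not part of Sedona or JTS; the crossing test only detects proper crossings and ignores collinear overlaps.

```python
def orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1 left turn, -1 right turn, 0 collinear."""
    value = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (value > 0) - (value < 0)


def segments_cross(p1, p2, q1, q2):
    """True if segments p1-p2 and q1-q2 properly cross at an interior point."""
    return (orient(p1, p2, q1) * orient(p1, p2, q2) < 0
            and orient(q1, q2, p1) * orient(q1, q2, p2) < 0)


def ring_problems(coords):
    """Return a list of human-readable defects found in a polygon ring."""
    problems = []
    if coords[0] != coords[-1]:
        problems.append("ring is not closed")
    segments = list(zip(coords, coords[1:]))
    # Compare only non-adjacent segments; adjacent ones share an endpoint.
    for i in range(len(segments)):
        for j in range(i + 2, len(segments)):
            if segments_cross(*segments[i], *segments[j]):
                problems.append(f"segments {i} and {j} cross")
    return problems


# Coordinates of the 'test2' polygon from the reproduction above.
ring = [(0, -1), (2, 1), (2, -1), (0, 1)]
problems = ring_problems(ring)
```

For this ring, the check reports both that the ring is not closed and that its first and third segments cross.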
## Settings
- Sedona version = 1.3.1
- Apache Spark version = 3.3.2
- API type = Python
- Scala version = 2.12
- JRE version = 1.8, 1.11?
- Python version = 3.9
- Environment = Standalone