set92 opened a new issue, #782:
URL: https://github.com/apache/sedona/issues/782

   ## Expected behavior
   
   Reading a file that contains invalid geometries should not fail. Ideally 
there would be a parameter to apply make_valid while reading the file, or a 
flag that assigns NULL (or some other value) to rows that fail to parse.
   
   For the code shown below, I expected output like this:
   ```
   +-----+------------------------+
   |name |geometry                |
   +-----+------------------------+
   |good |POINT (4.66452 52.13046)|
   |test1|NULL                    |
   |test2|NULL                    |
   +-----+------------------------+
   ```
   
   ## Actual behavior
   
   Instead, reading fails with the error raised by the last row.
   
   ## Steps to reproduce the problem
   
   The data below contains two invalid rows: the second is missing its 
closing ")", and the third is a polygon whose ring self-intersects and is not 
closed (its first and last points differ).
   ```python
   from pyspark.sql import SparkSession, Row
   from pyspark.sql.types import StructType, StructField, StringType
   import findspark
   
   # Locate the local Spark installation and add it to sys.path.
   findspark.init()
   spark = SparkSession.builder \
                       .master('local[1]') \
                       .appName('test') \
                       .getOrCreate()
   
   rows = [Row(geometry='POINT(4.66452 52.13046)', name='good'),
          Row(geometry='POINT(4.66452 52.13046', name='test1'),
          Row(geometry='POLYGON((0 -1, 2 1, 2 -1, 0 1))', name='test2')]
   newDf = spark.sparkContext.parallelize(rows)
   dfs2 = spark.createDataFrame(newDf, StructType([
       StructField('geometry', StringType(), True),
       StructField('name', StringType(), True)]))
   dfs2.createOrReplaceTempView("location")
   
   spark.sql(
         "SELECT name, st_geomFromWKT(geometry) as geometry from location"
   ).show(5)
   ```
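
Until such an option exists, one possible workaround is to pre-screen the WKT strings before they reach `st_geomFromWKT`. The sketch below is a minimal, hypothetical helper (`wkt_or_none` is not part of Sedona) that only checks two things: that parentheses are balanced and that every `POLYGON` ring is closed. It is not a full WKT validator and does not detect self-intersections.

```python
import re
from typing import Optional

# Hypothetical helper, not a Sedona API: a coarse pre-check for WKT strings.
# Matches the innermost parenthesised groups, i.e. the rings of a POLYGON.
_RING = re.compile(r"\(([^()]*)\)")


def wkt_or_none(wkt: Optional[str]) -> Optional[str]:
    """Return the WKT unchanged if it passes the coarse checks, else None."""
    if wkt is None:
        return None
    text = wkt.strip()

    # Check 1: parentheses must be balanced and never close before opening.
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return None
    if depth != 0:
        return None

    # Check 2: each POLYGON ring needs at least 4 points and must end
    # on the coordinate it started from.
    if text.upper().startswith("POLYGON"):
        for ring in _RING.findall(text):
            try:
                points = [tuple(float(v) for v in p.split())
                          for p in ring.split(",")]
            except ValueError:
                return None  # non-numeric coordinate
            if len(points) < 4 or points[0] != points[-1]:
                return None
    return text


# The three rows from the reproduction above.
for name, wkt in [("good", "POINT(4.66452 52.13046)"),
                  ("test1", "POINT(4.66452 52.13046"),
                  ("test2", "POLYGON((0 -1, 2 1, 2 -1, 0 1))")]:
    print(name, wkt_or_none(wkt))
```

Assuming the helper is registered with `spark.udf.register('wkt_or_none', wkt_or_none, StringType())`, the query could call `st_geomFromWKT(wkt_or_none(geometry))`. Whether a NULL input yields a NULL geometry on Sedona 1.3.1 is an assumption that has not been verified here.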
   ## Settings
   
   - Sedona version = 1.3.1
   - Apache Spark version = 3.3.2
   - API type = Python
   - Scala version = 2.12
   - JRE version = 1.8, 1.11?
   - Python version = 3.9
   - Environment = Standalone

