pszhuchkov opened a new issue, #1237:
URL: https://github.com/apache/sedona/issues/1237
## Expected behavior
I want to apply a schema that includes `GeometryType` when converting an RDD
back to a DataFrame after calling the RDD `map` function.
## Actual behavior
`ValueError: field geom: <shapely.geometry.point.Point object at 0x7fa204b85750> is not an instance of type GeometryType()`
## Steps to reproduce the problem
The following test case is adapted from the tutorials:
```Python
from pyspark.sql.types import IntegerType, StructField, StructType
from sedona.sql.types import GeometryType
from shapely.geometry import Point

# `spark` is assumed to be an existing SparkSession with Sedona registered.
schema = StructType(
    [
        StructField("id", IntegerType(), False),
        StructField("geom", GeometryType(), False),
    ]
)

data = [
    [1, Point(21.0, 52.0)],
    [1, Point(23.0, 42.0)],
    [1, Point(26.0, 32.0)],
]

gdf = spark.createDataFrame(data, schema)
gdf.show()


def dummy_map(row):
    # some logic here
    return row


# This call raises the ValueError shown above.
test_df = gdf.rdd.map(dummy_map).toDF(gdf.schema)
# or test_df = gdf.rdd.map(dummy_map).toDF(schema)
```
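The snippet below might help narrow this down. It is only a diagnostic sketch, not a fix. It assumes the error comes from PySpark's schema verifier, which, as far as I can tell from `pyspark.sql.types`, rejects a value for a user-defined type when the value has no `__UDT__` attribute. I have not confirmed that this is the cause here. The helper name `udt_report` is hypothetical.

```python
def udt_report(row):
    """Describe each value in a row: its type name and whether it has __UDT__.

    Assumption: PySpark's verifier for user-defined types checks for a
    `__UDT__` attribute on the value. This helper only reports whether the
    attribute is present; it does not change any behavior.
    """
    return [(type(value).__name__, hasattr(value, "__UDT__")) for value in row]


# Possible usage with the DataFrame from the reproduction above
# (requires the Spark session, so it is left commented out):
# print(gdf.rdd.map(udt_report).take(3))
```

If the `Point` values report `False` on the executors, that would suggest the geometry classes lose their link to `GeometryType` in the worker processes.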
## Settings
Sedona version = 1.5.1
Apache Spark version = Spark 3.3.0
API type = Python
Environment = EMR 6.9.0