Kontinuation opened a new pull request, #1540: URL: https://github.com/apache/sedona/pull/1540
## Did you read the Contributor Guide?

- Yes, I have read the [Contributor Rules](https://sedona.apache.org/latest-snapshot/community/rule/) and [Contributor Development Guide](https://sedona.apache.org/latest-snapshot/community/develop/)

## Is this PR related to a JIRA ticket?

- Yes, the URL of the associated JIRA ticket is https://issues.apache.org/jira/browse/SEDONA-637. The PR name follows the format `[SEDONA-XXX] my subject`.

## What changes were proposed in this PR?

Spatial filters pushed down to the GeoParquet scan node are now visible in the query plan. For example, the following query

```python
df.where("ST_Intersects(geometry, ST_Point(1, 1))").explain()
```

produces the following query plan:

```
== Physical Plan ==
Filter (isnotnull(geometry#218) AND **org.apache.spark.sql.sedona_sql.expressions.ST_Intersects** )
+- FileScan geoparquet [id#217L,geometry#218,bbox#219] Batched: false, DataFilters: [isnotnull(geometry#218), **org.apache.spark.sql.sedona_sql.expressions.ST_Intersects** ], Format: GeoParquet with spatial filter [geometry INTERSECTS POINT (1 1)], Location: InMemoryFileIndex(1 paths).., PartitionFilters: [], PushedFilters: [IsNotNull(geometry)], ReadSchema: struct<id:bigint,geometry:binary,bbox:struct<xmin:double,ymin:double,xmax:double,ymax:double>>
```

The spatial filters pushed down to the GeoParquet scan are shown in `Format: GeoParquet with spatial filter [...]`.
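To check for the annotation programmatically (for instance in a test), the plan text can be searched for the `Format: GeoParquet with spatial filter [...]` fragment. The following is only a minimal sketch: `extract_spatial_filter` is a hypothetical helper that is not part of Sedona, the pattern is taken from the example plan above and may differ between versions, and it assumes the filter text itself contains no `]`.

```python
import re
from typing import Optional

# Pattern modelled on the annotation in the example plan above.
# Assumption: the wording is stable and the filter text contains no ']'.
_SPATIAL_FILTER_RE = re.compile(
    r"Format: GeoParquet with spatial filter \[(.*?)\]"
)


def extract_spatial_filter(plan_text: str) -> Optional[str]:
    """Return the pushed-down spatial filter text, or None if it is absent."""
    match = _SPATIAL_FILTER_RE.search(plan_text)
    return match.group(1) if match else None


# Abbreviated plan fragments copied from the kind of output shown in this PR.
plan_with_filter = (
    "+- FileScan geoparquet [id#217L,geometry#218,bbox#219] Batched: false, "
    "Format: GeoParquet with spatial filter [geometry INTERSECTS POINT (1 1)], "
    "Location: InMemoryFileIndex(1 paths).."
)
plan_without_filter = (
    "+- FileScan geoparquet [id#217L,geometry#218,bbox#219] Batched: false, "
    "Format: GeoParquet, Location: InMemoryFileIndex(1 paths).."
)

print(extract_spatial_filter(plan_with_filter))     # geometry INTERSECTS POINT (1 1)
print(extract_spatial_filter(plan_without_filter))  # None
```

A plan without the annotation yields `None`. How the plan text is captured from a live session is deliberately left out of this sketch.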
Spatial filter push-down can be disabled manually by setting the Spark configuration `spark.sedona.geoparquet.spatialFilterPushDown` to `false`:

```python
spark.conf.set("spark.sedona.geoparquet.spatialFilterPushDown", "false")
df.where("ST_Intersects(geometry, ST_Point(1, 1))").explain()
```

```
== Physical Plan ==
Filter (isnotnull(geometry#218) AND **org.apache.spark.sql.sedona_sql.expressions.ST_Intersects** )
+- FileScan geoparquet [id#217L,geometry#218,bbox#219] Batched: false, DataFilters: [isnotnull(geometry#218), **org.apache.spark.sql.sedona_sql.expressions.ST_Intersects** ], Format: GeoParquet, Location: InMemoryFileIndex(1 paths).., PartitionFilters: [], PushedFilters: [IsNotNull(geometry)], ReadSchema: struct<id:bigint,geometry:binary,bbox:struct<xmin:double,ymin:double,xmax:double,ymax:double>>
```

## How was this patch tested?

Passes the newly added tests.

## Did this PR include necessary documentation updates?

- Yes, I have updated the documentation.