jiayuasu commented on PR #693: URL: https://github.com/apache/incubator-sedona/pull/693#issuecomment-1254609739
@douglasdennis Another interesting finding: since the type-safe DataFrame APIs do not need to call "udf.register()" to register all functions, is it possible that, as a side effect, predicate pushdown is finally supported in Sedona?

If you don't know what predicate pushdown is, see this: https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-Optimizer-PushDownPredicate.html

Since Sedona 1.3.0 will natively read GeoParquet (the PR is already merged to master), Sedona user RJ Marcus is working hard to get predicate pushdown to work on GeoParquet. See [SEDONA-156](https://issues.apache.org/jira/browse/SEDONA-156).

Quoting my reply to RJ Marcus:

> Pushed filter: UDF function can be pushed down as a filter. Based on my understanding, Sedona ST functions cannot be pushed down because UDFs in pure Spark SQL are a black box to Spark Catalyst unless we do something with the current Sedona ST functions.
>
> However, Sedona implements all functions as Spark SQL Catalyst "Expressions" [1] instead of naive UDFs. This makes it possible to push them down to the data source (see [2]). There is an ongoing effort to enable Sedona ST functions in a type-safe format, which bypasses the "udf.register" step (see [3]).
>
> So, with the current Sedona GeoParquet reader and [3], it is possible that the Pushed filter will finally be supported. You might want to check it out and confirm my wild guess.
>
> [1] https://github.com/apache/incubator-sedona/blob/master/sql/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/Constructors.scala#L45
> [2] https://neapowers.com/apache-spark/native-functions-catalyst-expressions/
> [3] https://github.com/apache/incubator-sedona/pull/693

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
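
For readers unfamiliar with the distinction the comment draws between black-box UDFs and inspectable expressions, here is a minimal toy sketch in plain Python. It is not Spark, Catalyst, or Sedona code: `GreaterThan`, `OpaqueUdf`, `RowGroup`, and `scan` are hypothetical names invented for illustration, and the per-group maximum is computed from the rows here, whereas a real Parquet reader would take it from file metadata. The sketch only shows why a planner can skip data for a predicate whose structure it can read, and cannot for an opaque callable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union

Row = Dict[str, float]


@dataclass(frozen=True)
class GreaterThan:
    """Transparent predicate: the scan can read its column and bound."""
    column: str
    bound: float

    def evaluate(self, row: Row) -> bool:
        return row[self.column] > self.bound


@dataclass(frozen=True)
class OpaqueUdf:
    """Black-box predicate: the scan only sees a callable it cannot inspect."""
    fn: Callable[[Row], bool]

    def evaluate(self, row: Row) -> bool:
        return self.fn(row)


@dataclass
class RowGroup:
    """Hypothetical stand-in for a chunk of rows with a max statistic."""
    rows: List[Row]

    def max_of(self, column: str) -> float:
        # Assumption: computed on the fly; real formats store this in metadata.
        return max(row[column] for row in self.rows)


Predicate = Union[GreaterThan, OpaqueUdf]


def scan(groups: List[RowGroup], predicate: Predicate) -> Tuple[List[Row], int]:
    """Return the matching rows and the number of row groups actually read."""
    matched: List[Row] = []
    groups_read = 0
    for group in groups:
        # "Pushdown": skip a group when its statistic proves no row can match.
        # This is only possible when the predicate's structure is visible.
        if isinstance(predicate, GreaterThan):
            if group.max_of(predicate.column) <= predicate.bound:
                continue
        groups_read += 1
        matched.extend(row for row in group.rows if predicate.evaluate(row))
    return matched, groups_read


groups = [
    RowGroup([{"x": 1}, {"x": 2}]),
    RowGroup([{"x": 10}, {"x": 20}]),
]

transparent_rows, transparent_reads = scan(groups, GreaterThan("x", 5))
opaque_rows, opaque_reads = scan(groups, OpaqueUdf(lambda row: row["x"] > 5))
```

Both calls return the same rows, but the transparent predicate lets the scan skip the first group, while the opaque one forces every group to be read. Whether Sedona's expressions actually reach the GeoParquet reader as pushed filters is the open question in the comment, and this sketch does not answer it.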
