Kontinuation commented on code in PR #1208:
URL: https://github.com/apache/sedona/pull/1208#discussion_r1467198376


##########
spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/strategy/join/JoinQueryDetector.scala:
##########
@@ -140,6 +134,18 @@ class JoinQueryDetector(sparkSession: SparkSession) extends Strategy {
         case Some(And(extraCondition, predicate: RS_Predicate)) =>
           getRasterJoinDetection(left, right, predicate, Some(extraCondition))
         // For distance joins we execute the actual predicate (condition) and not only extraConditions.
+        case Some(ST_DWithin(Seq(leftShape, rightShape, distance))) =>
+          Some(JoinQueryDetection(left, right, leftShape, rightShape, SpatialPredicate.INTERSECTS, isGeography = false, condition, Some(distance)))
+        case Some(And(ST_DWithin(Seq(leftShape, rightShape, distance)), _)) =>
+          Some(JoinQueryDetection(left, right, leftShape, rightShape, SpatialPredicate.INTERSECTS, isGeography = false, condition, Some(distance)))
+        case Some(And(_, ST_DWithin(Seq(leftShape, rightShape, distance)))) =>
+          Some(JoinQueryDetection(left, right, leftShape, rightShape, SpatialPredicate.INTERSECTS, isGeography = false, condition, Some(distance)))
+        case Some(ST_DWithin(Seq(leftShape, rightShape, distance, useSpheroid))) =>
+          Some(JoinQueryDetection(left, right, leftShape, rightShape, SpatialPredicate.INTERSECTS, isGeography = useSpheroid.eval().asInstanceOf[Boolean], condition, Some(distance)))
+        case Some(And(ST_DWithin(Seq(leftShape, rightShape, distance, useSpheroid)), _)) =>
+          Some(JoinQueryDetection(left, right, leftShape, rightShape, SpatialPredicate.INTERSECTS, isGeography = useSpheroid.eval().asInstanceOf[Boolean], condition, Some(distance)))
+        case Some(And(_, ST_DWithin(Seq(leftShape, rightShape, distance, useSpheroid)))) =>
+          Some(JoinQueryDetection(left, right, leftShape, rightShape, SpatialPredicate.INTERSECTS, isGeography = useSpheroid.eval().asInstanceOf[Boolean], condition, Some(distance)))

Review Comment:
   `useSpheroid.eval()` raises an exception when `useSpheroid` cannot be
   folded to a constant boolean literal, for instance when it references a
   column. Here is an example:
   
   ```python
   from pyspark.sql.functions import expr

   df_point = spark.range(10).withColumn("pt", expr("ST_Point(id, id)"))
   df_polygon = spark.range(10).withColumn("poly", expr("ST_Point(id, id + 0.01)"))
   # The fourth argument (useSpheroid) depends on a column, so it is not a constant.
   df_point.alias("a").join(df_polygon.alias("b"), expr("ST_DWithin(pt, poly, 10000, a.`id` % 2 = 0)")).show()
   ```
   
   This query fails with the following message:
   
   ```
   Py4JJavaError: An error occurred while calling o400.showString.
   : org.apache.spark.SparkException: [INTERNAL_ERROR] Cannot evaluate expression: id#1026L
        at org.apache.spark.SparkException$.internalError(SparkException.scala:92)
        at org.apache.spark.SparkException$.internalError(SparkException.scala:96)
        at org.apache.spark.sql.errors.QueryExecutionErrors$.cannotEvaluateExpressionError(QueryExecutionErrors.scala:65)
        at org.apache.spark.sql.catalyst.expressions.Unevaluable.eval(Expression.scala:385)
        at org.apache.spark.sql.catalyst.expressions.Unevaluable.eval$(Expression.scala:384)
        at org.apache.spark.sql.catalyst.expressions.AttributeReference.eval(namedExpressions.scala:260)
        at org.apache.spark.sql.catalyst.expressions.DivModLike.eval(arithmetic.scala:670)
        at org.apache.spark.sql.catalyst.expressions.DivModLike.eval$(arithmetic.scala:664)
        at org.apache.spark.sql.catalyst.expressions.Remainder.eval(arithmetic.scala:930)
        at org.apache.spark.sql.catalyst.expressions.BinaryExpression.eval(Expression.scala:664)
        at org.apache.spark.sql.sedona_sql.strategy.join.JoinQueryDetector.apply(JoinQueryDetector.scala:144)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$1(QueryPlanner.scala:63)
   
   ```
   
   Because a non-constant `useSpheroid` is uncommon, falling back to a
   cartesian join in this case is acceptable. What we should not do is fail
   at planning time and prevent such queries from running at all.
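
To illustrate the kind of guard that would avoid the crash, here is a toy Python model. It is not Sedona or Catalyst code: the classes `Literal`, `Attribute`, `IsEven` and the function `spheroid_flag` are hypothetical stand-ins invented for this sketch. The idea is to call `eval()` only when the expression is foldable (Catalyst exposes this as `Expression.foldable`) and otherwise signal "no optimized join" so that planning can fall through to another strategy.

```python
from dataclasses import dataclass
from typing import Any, Optional


class Unevaluable(Exception):
    """Raised when an expression needs row data that planning does not have."""


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for a constant expression."""
    value: Any

    @property
    def foldable(self) -> bool:
        return True

    def eval(self) -> Any:
        return self.value


@dataclass(frozen=True)
class Attribute:
    """Hypothetical stand-in for a column reference such as a.`id`."""
    name: str

    @property
    def foldable(self) -> bool:
        return False

    def eval(self) -> Any:
        raise Unevaluable(f"Cannot evaluate expression: {self.name}")


@dataclass(frozen=True)
class IsEven:
    """Hypothetical stand-in for `child % 2 = 0`; foldable only if its child is."""
    child: Any

    @property
    def foldable(self) -> bool:
        return self.child.foldable

    def eval(self) -> bool:
        return self.child.eval() % 2 == 0


def spheroid_flag(use_spheroid: Any) -> Optional[bool]:
    """Return the constant flag, or None when it depends on row data."""
    if not use_spheroid.foldable:
        # Assumption: the caller treats None as "skip the optimized join".
        return None
    return bool(use_spheroid.eval())


if __name__ == "__main__":
    print(spheroid_flag(Literal(True)))            # constant flag: safe to evaluate
    print(spheroid_flag(IsEven(Attribute("id"))))  # column-dependent: None, no exception
```

With a guard like this, the constant case keeps the optimized distance join, while the column-dependent case from the example above is planned by whatever strategy applies next instead of raising during planning.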


