jiayuasu commented on code in PR #1208:
URL: https://github.com/apache/sedona/pull/1208#discussion_r1460174600


##########
spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/strategy/join/JoinQueryDetector.scala:
##########
@@ -132,6 +132,12 @@ class JoinQueryDetector(sparkSession: SparkSession) 
extends Strategy {
           Some(JoinQueryDetection(left, right, leftShape, 
ST_Buffer(Seq(rightShape, distance)), SpatialPredicate.INTERSECTS, isGeography 
= false, condition, Some(extraCondition)))
         case Some(And(extraCondition, ST_DWithin(Seq(leftShape, rightShape, 
distance)))) =>
           Some(JoinQueryDetection(left, right, leftShape, 
ST_Buffer(Seq(rightShape, distance)), SpatialPredicate.INTERSECTS, isGeography 
= false, condition, Some(extraCondition)))
+        case Some(ST_DWithin(Seq(leftShape, rightShape, distance, 
useSpheroid))) =>

Review Comment:
   When `useSpheroid` is `true` , this should not use ST_Buffer because 
buffering a lon/lat geometry with meter distance does not make any sense and 
will lead to wrong result. When `useSpheroid = true`, we should make it 
equivalent to `ST_DistanceSpheroid() < XXX`.
   
   I think we should totally get rid of ST_Buffer to avoid confusion. All 
`ST_DWithin` patterns, once get matched, should be translated to distance join, 
like what we did for ST_Distance and ST_DistanceSpheroid.  Buffering a geometry 
with a spheroid distance is already taken care by the optimized distance join 
triggered `ST_DistanceSpheroid() < XXX`, which is quite complex.
   
   ```
        case Some(LessThanOrEqual(ST_Distance(Seq(leftShape, rightShape)), 
distance)) =>
                  Some(JoinQueryDetection(left, right, leftShape, rightShape, 
SpatialPredicate.INTERSECTS, false, condition, Some(distance)))
           // ST_DistanceSpheroid
           case Some(LessThanOrEqual(ST_DistanceSpheroid(Seq(leftShape, 
rightShape)), distance)) =>
             Some(JoinQueryDetection(left, right, leftShape, rightShape, 
SpatialPredicate.INTERSECTS, true, condition, Some(distance)))
   ```
   



##########
docs/api/flink/Predicate.md:
##########
@@ -60,9 +60,11 @@ true
 
 ## ST_DWithin
 
-Introduction: Returns true if 'leftGeometry' and 'rightGeometry' are within a 
specified 'distance'. This function essentially checks if the shortest distance 
between the envelope of the two geometries is <= the provided distance.
+Introduction: Returns true if 'leftGeometry' and 'rightGeometry' are within a 
specified 'distance' (in metres). This function essentially checks if the 
shortest distance between the envelope of the two geometries is <= the provided 
distance.

Review Comment:
   Please change every `Sphere` to `Spheroid` in the doc and other code. In the 
geodesic distance world, `Sphere distance` means great circle distance while 
`spheroid distance` means geodesic distance. These are 2 different things.
   
   In our case, we use `spheroid` not `sphere`.
   
   In addition, it is worth noting that when we use `spheroid` distance, there 
is a current limitation in Sedona: we calculate the distance between the 
centroids of two geometries, not the shortest distance between two geometries. 
This tiny error usually does not matter much to end users because spheroid 
distances are mainly used for getting the distance between two objects across 
the globe.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to