Kontinuation opened a new pull request, #611:
URL: https://github.com/apache/sedona-db/pull/611

   ## Summary
   
   KNN joins have different semantics from regular spatial joins: pushing a 
filter down to the object (build) side shrinks the candidate set and therefore 
changes which objects are the k nearest neighbors, producing incorrect results. 
DataFusion's built-in `PushDownFilter` optimizer rule is unaware of this and 
pushes filters through KNN joins.
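
To make the semantic difference concrete, here is a minimal, self-contained sketch in Python. It is not SedonaDB code: the points, the `open` attribute, and the brute-force `knn` helper are made up for illustration. It compares filtering after the KNN lookup with filtering the object side first.

```python
from math import dist

def knn(query, objects, k):
    """Return the k objects closest to `query` (brute force)."""
    return sorted(objects, key=lambda o: dist(query, o["pt"]))[:k]

query = (0.0, 0.0)
objects = [
    {"id": "a", "pt": (1.0, 0.0), "open": False},
    {"id": "b", "pt": (2.0, 0.0), "open": True},
    {"id": "c", "pt": (3.0, 0.0), "open": True},
]

# Intended semantics: find the nearest neighbor first, then apply the filter.
filter_after = [o["id"] for o in knn(query, objects, k=1) if o["open"]]

# What a pushed-down filter computes: filter the objects first, then search.
open_objects = [o for o in objects if o["open"]]
filter_before = [o["id"] for o in knn(query, open_objects, k=1)]
```

Here the nearest object `a` fails the filter, so the intended result is empty, while the pushed-down variant returns `b`, which was never the nearest neighbor.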
   
   This PR adds a `KnnJoinEarlyRewrite` optimizer rule that converts KNN joins 
to `SpatialJoinPlanNode` extension nodes **before** DataFusion's 
`PushDownFilter` rule runs. Extension nodes block filter pushdown by default, 
because `prevent_predicate_push_down_columns()` returns all columns.
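
The blocking mechanism can be pictured with a toy model. The classes and the `can_push_below` helper below are hypothetical stand-ins written in Python, not DataFusion's API. Only the idea is taken from the description above: an extension node reports a set of columns that predicates may not be pushed past.

```python
from dataclasses import dataclass

@dataclass
class Scan:
    columns: list

@dataclass
class Extension:
    """Stand-in for an extension node such as SpatialJoinPlanNode."""
    inputs: list

    def output_columns(self):
        return [c for child in self.inputs for c in child.columns]

    def blocked_columns(self):
        # Models the behavior described above: every column is blocked.
        return set(self.output_columns())

def can_push_below(predicate_columns, node):
    """A predicate may move below `node` only if it touches no blocked column."""
    return not (set(predicate_columns) & node.blocked_columns())

# Hypothetical column names for the query side (q) and object side (o).
knn_node = Extension([Scan(["q.geom"]), Scan(["o.geom", "o.open"])])
pushable = can_push_below(["o.open"], knn_node)
```

In this model a filter on `o.open` stays above the node, which is the outcome the early rewrite relies on.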
   
   ## Changes
   
   - **New `KnnJoinEarlyRewrite` optimizer rule** — handles two patterns:
     1. `Join(filter=ST_KNN(...))` — when the ON clause has only the spatial 
predicate
     2. `Filter(ST_KNN(...), Join(on=[...]))` — when the ON clause also has 
equi-join conditions (DataFusion's SQL planner separates these)
   - **Positional rule insertion** — `MergeSpatialProjectionIntoJoin` and 
`KnnJoinEarlyRewrite` are inserted before `PushDownFilter`, while 
`SpatialJoinLogicalRewrite` (for non-KNN joins) remains after so non-KNN joins 
still benefit from filter pushdown
   - **Updated `SpatialJoinLogicalRewrite`** — skips KNN joins (already handled 
by the early rewrite)
   - **Integration tests** verifying that object-side filters are NOT pushed 
down for KNN joins, but ARE pushed down for non-KNN spatial joins
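
The two patterns handled by the early rewrite can be sketched on a toy plan tree. This is an illustrative Python model, not the rule's implementation: predicates are plain strings, the `Join`, `Filter` and `SpatialJoinNode` classes are hypothetical, and carrying the equi-join keys along in `on` is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Join:
    left: str
    right: str
    on: list = field(default_factory=list)  # equi-join key pairs
    filter: Optional[str] = None             # non-equi join predicate

@dataclass
class Filter:
    predicate: str
    input: object

@dataclass
class SpatialJoinNode:
    """Stand-in for the SpatialJoinPlanNode extension node."""
    left: str
    right: str
    knn_predicate: str
    on: list

def is_knn(pred):
    return pred is not None and pred.startswith("ST_KNN(")

def early_rewrite(plan):
    # Pattern 1: Join(filter=ST_KNN(...))
    if isinstance(plan, Join) and is_knn(plan.filter):
        return SpatialJoinNode(plan.left, plan.right, plan.filter, plan.on)
    # Pattern 2: Filter(ST_KNN(...), Join(on=[...]))
    if (isinstance(plan, Filter) and is_knn(plan.predicate)
            and isinstance(plan.input, Join)):
        j = plan.input
        return SpatialJoinNode(j.left, j.right, plan.predicate, j.on)
    # Anything else (including non-KNN spatial joins) is left untouched.
    return plan

p1 = early_rewrite(Join("q", "o", filter="ST_KNN(q.geom, o.geom, 3)"))
p2 = early_rewrite(Filter("ST_KNN(q.geom, o.geom, 3)",
                          Join("q", "o", on=[("q.id", "o.id")])))
p3 = early_rewrite(Join("q", "o", filter="ST_Intersects(q.geom, o.geom)"))
```

Leaving `p3` as an ordinary `Join` mirrors the intent that non-KNN joins remain visible to filter pushdown until the later rewrite.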
   
   ## Rule ordering
   
   ```
   ... → MergeSpatialProjectionIntoJoin → KnnJoinEarlyRewrite → PushDownFilter 
→ ... → SpatialJoinLogicalRewrite
   ```
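
The positional insertion shown above can be sketched with a plain list of rule names. This Python snippet is illustrative only: `insert_before` is a hypothetical helper and `default_rules` is a shortened, assumed rule list, not the optimizer's actual one.

```python
def insert_before(rules, anchor, new_rules):
    """Return a copy of `rules` with `new_rules` placed just before `anchor`."""
    i = rules.index(anchor)
    return rules[:i] + new_rules + rules[i:]

# Shortened, assumed rule list; the real optimizer has many more rules.
default_rules = ["SimplifyExpressions", "PushDownFilter", "PushDownLimit"]

rules = insert_before(
    default_rules,
    "PushDownFilter",
    ["MergeSpatialProjectionIntoJoin", "KnnJoinEarlyRewrite"],
)
# The non-KNN rewrite stays after filter pushdown.
rules.append("SpatialJoinLogicalRewrite")
```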
   
   Closes https://github.com/apache/sedona-db/issues/605

