Kontinuation opened a new pull request, #641:
URL: https://github.com/apache/sedona-db/pull/641

   ## Summary
   
   - Adds a `KnnQuerySideFilterPushdown` optimizer rule that automatically 
pushes query-side-only filters below the `SpatialJoinPlanNode` extension node 
for KNN inner joins
   - Only handles `INNER JOIN` (conservative start); outer join support can be 
added later
   - Updates docs to document the automatic pushdown behavior and clarify when 
`barrier()` is still needed
   
   ## Background
   
   Previously, KNN joins blocked ALL filter pushdown (both query-side and 
object-side) because the `SpatialJoinPlanNode` extension node's default 
`prevent_predicate_push_down_columns()` returns all columns. Object-side 
pushdown must remain blocked (it changes KNN candidate sets), but query-side 
pushdown is safe and should be automatic.
   
   DataFusion's built-in `PushDownFilter` pushes the same predicate to ALL 
children of an extension node, so a query-side filter like `h.stars >= 4` would 
fail when applied to the object-side child, which has no `h.stars` column. 
Pushing filters to the query side only therefore requires a custom optimizer rule.
   
   ## Implementation
   
   The `KnnQuerySideFilterPushdown` rule:
   1. Pattern matches `Filter(predicate, Extension(SpatialJoinPlanNode))` where 
the join filter contains `ST_KNN`
   2. Uses `find_knn_query_side()` to determine which child is the query side 
(from the first argument of `ST_KNN`)
   3. Splits the filter predicate into conjuncts; pushes query-side-only 
conjuncts below the extension node; keeps the rest above
   4. Runs **before** DataFusion's `PushDownFilter` so the pushed-down filters 
are further optimized into scan nodes in the same pass
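
The conjunct-splitting step (3) can be illustrated with a minimal, stdlib-only 
sketch. This is a toy model, not the code in this PR: the `Expr` and `Op` types 
stand in for DataFusion's expression tree, `partition_for_query_side` is a 
hypothetical helper name, and the object-side alias `r` and its columns are 
assumptions made up for the example. It also assumes that a conjunct with no 
column references stays above the join.

```rust
use std::collections::HashSet;

/// Toy expression tree; a stand-in for DataFusion's `Expr`, not its real API.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column { relation: String, name: String },
    Literal(i64),
    Binary { left: Box<Expr>, op: Op, right: Box<Expr> },
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    And,
    GtEq,
    Lt,
}

fn col(relation: &str, name: &str) -> Expr {
    Expr::Column { relation: relation.to_string(), name: name.to_string() }
}

fn binary(left: Expr, op: Op, right: Expr) -> Expr {
    Expr::Binary { left: Box::new(left), op, right: Box::new(right) }
}

/// Flatten nested ANDs into a flat list of conjuncts, left to right.
fn split_conjuncts(expr: &Expr, out: &mut Vec<Expr>) {
    match expr {
        Expr::Binary { left, op: Op::And, right } => {
            split_conjuncts(left, out);
            split_conjuncts(right, out);
        }
        other => out.push(other.clone()),
    }
}

/// Collect the relation (table alias) of every column referenced in `expr`.
fn relations(expr: &Expr, out: &mut HashSet<String>) {
    match expr {
        Expr::Column { relation, .. } => {
            out.insert(relation.clone());
        }
        Expr::Literal(_) => {}
        Expr::Binary { left, right, .. } => {
            relations(left, out);
            relations(right, out);
        }
    }
}

/// Returns `(push_down, keep_above)`: conjuncts that reference only the
/// query side go below the join; everything else stays above it.
fn partition_for_query_side(predicate: &Expr, query_side: &str) -> (Vec<Expr>, Vec<Expr>) {
    let mut conjuncts = Vec::new();
    split_conjuncts(predicate, &mut conjuncts);
    conjuncts.into_iter().partition(|conjunct| {
        let mut rels = HashSet::new();
        relations(conjunct, &mut rels);
        // Assumption: column-free conjuncts are left above the join.
        !rels.is_empty() && rels.iter().all(|r| r.as_str() == query_side)
    })
}

fn main() {
    // h.stars >= 4 AND r.rating < 3 AND h.price < r.budget, with `h` as query side.
    let stars = binary(col("h", "stars"), Op::GtEq, Expr::Literal(4));
    let rating = binary(col("r", "rating"), Op::Lt, Expr::Literal(3));
    let mixed = binary(col("h", "price"), Op::Lt, col("r", "budget"));
    let predicate = binary(binary(stars.clone(), Op::And, rating.clone()), Op::And, mixed.clone());

    let (push_down, keep_above) = partition_for_query_side(&predicate, "h");
    assert_eq!(push_down, vec![stars]);
    assert_eq!(keep_above, vec![rating, mixed]);
    println!("push below join: {:?}", push_down);
    println!("keep above join: {:?}", keep_above);
}
```

In this model only `h.stars >= 4` is pushed down. The object-side conjunct and 
the mixed conjunct both stay above the join, which matches the rule that 
object-side filtering must not change the KNN candidate set.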
   
   ## Testing
   
   - 12 unit tests for `find_st_knn_call` and `find_knn_query_side`
   - 3 integration tests verifying correct plan structure (filter pushed into 
query-side child, object-side filters stay above)
   
   Depends on #635


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
