andygrove opened a new pull request, #3694:
URL: https://github.com/apache/datafusion-comet/pull/3694
## Which issue does this PR close?
Closes #3313.
## Rationale for this change
`CometNativeScan.isDynamicPruningFilter` only checked for
`PlanExpression[_]` in partition filters to detect Dynamic Partition Pruning
(DPP). This catches dynamic DPP (where subqueries are still present) but
misses static DPP, where Spark has already resolved the pruning expression to a
literal wrapped in `DynamicPruningExpression`, i.e.
`DynamicPruningExpression(Literal.TrueLiteral)`. When static
DPP slips through, `native_datafusion` replaces the scan but cannot properly
handle DPP-related plan structures, causing test failures ("static scan
metrics", "explain formatted - check presence of subquery in case of DPP") and
potential runtime errors.
## What changes are included in this PR?
- **`CometNativeScan.scala`**: Updated `isDynamicPruningFilter` to also
check for `DynamicPruningExpression` at the top level, catching both dynamic
and static DPP. This ensures `native_datafusion` always falls back when any
form of DPP is present in partition filters.
- **`CometExecSuite.scala`**: Added test "DPP fallback with
native_datafusion scan" that verifies `native_datafusion` correctly falls back
when DPP is detected (even with `COMET_DPP_FALLBACK_ENABLED=false`, since
detection happens independently in `CometNativeScan.isSupported`).
- **`dev/diffs/3.5.8.diff`**: Removed `IgnoreCometNativeDataFusion` skip
tags for issue #3313 from "static scan metrics" and "explain formatted - check
presence of subquery in case of DPP" tests, since the fix makes these tests
pass with `native_datafusion`. Diff regenerated via the documented Spark clone
workflow.
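
The detection change can be illustrated with a minimal, self-contained sketch. It is not the code from `CometNativeScan.scala`: the `Expr` hierarchy below is a hypothetical stand-in for Spark's Catalyst classes (`Subquery` plays the role of any `PlanExpression[_]`, `DynamicPruning` the role of `DynamicPruningExpression`), and the helper names are assumptions made for illustration only.

```scala
// Simplified stand-ins for Spark Catalyst expression classes (hypothetical;
// the real classes live in org.apache.spark.sql.catalyst.expressions).
sealed trait Expr { def children: Seq[Expr] }
final case class Attr(name: String) extends Expr { def children: Seq[Expr] = Nil }
final case class Lit(value: Any) extends Expr { def children: Seq[Expr] = Nil }
final case class In(value: Expr, subquery: Expr) extends Expr {
  def children: Seq[Expr] = Seq(value, subquery)
}
// Stands in for any PlanExpression[_], i.e. a subquery still present in the plan.
final case class Subquery(id: Int) extends Expr { def children: Seq[Expr] = Nil }
// Stands in for DynamicPruningExpression.
final case class DynamicPruning(child: Expr) extends Expr {
  def children: Seq[Expr] = Seq(child)
}

object DppDetection {
  // True if `e` or any of its descendants satisfies `p`.
  def exists(e: Expr)(p: Expr => Boolean): Boolean =
    p(e) || e.children.exists(c => exists(c)(p))

  // Earlier behaviour: only looks for a subquery anywhere in the filter.
  def hasPlanExpression(e: Expr): Boolean =
    exists(e)(_.isInstanceOf[Subquery])

  // Sketch of the updated check: also match the wrapper at the top level,
  // so a filter whose child is already a literal is still detected.
  def isDynamicPruningFilter(e: Expr): Boolean =
    e.isInstanceOf[DynamicPruning] || hasPlanExpression(e)
}

object Examples {
  // Dynamic DPP: the wrapper still contains a subquery.
  val dynamicDpp: Expr = DynamicPruning(In(Attr("part"), Subquery(1)))
  // Static DPP: the wrapper contains only a literal.
  val staticDpp: Expr = DynamicPruning(Lit(true))
}
```

In this model, `hasPlanExpression(Examples.staticDpp)` is `false`, which mirrors the gap described in the rationale, while `isDynamicPruningFilter` returns `true` for both `Examples.dynamicDpp` and `Examples.staticDpp`.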
## How are these changes tested?
- New unit test "DPP fallback with native_datafusion scan" in
`CometExecSuite` validates the fix
- Existing "DPP fallback" test continues to pass
- Spark SQL tests previously skipped via `IgnoreCometNativeDataFusion` tags
should now pass in CI