[I] [native_datafusion] [Spark SQL Tests] Nested schema pruning tests fail — CometNativeScan not recognized as file source [datafusion-comet]

via GitHub Wed, 28 Jan 2026 16:55:43 -0800


andygrove opened a new issue, #3318:
URL: https://github.com/apache/datafusion-comet/issues/3318


   ## Summary
   
   ~130 tests in `SchemaPruningSuite` (both "Spark vectorized reader" and 
"Non-vectorized reader" variants, with and without partition data columns) fail 
because `CometNativeScan` is not recognized as a file source scan node.
   
   ## Error Pattern
   
   All failures have the same error:
   ```
   0 did not equal 1 Found 0 file sources in dataframe, but expected 
ArraySeq(struct<...>)
   ```
   
   The test infrastructure looks for `FileSourceScanExec` or `BatchScanExec` 
nodes in the query plan to verify schema pruning. `CometNativeScan` is neither 
of these, so the tests find 0 file sources and fail.
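
The lookup described above can be sketched with a toy plan tree. This is only an illustration of the failure mode, not the suite's actual code: `PlanNode`, `FileSourceScan`, `BatchScan`, `NativeScan`, `Project`, and both helper functions are hypothetical stand-ins for the real Spark and Comet classes, and the schema string is made up.

```scala
// Toy stand-ins for physical plan nodes. None of these are Spark's or
// Comet's real classes; they only model the shape of the problem.
sealed trait PlanNode { def children: Seq[PlanNode] }
final case class FileSourceScan(requiredSchema: String) extends PlanNode {
  val children: Seq[PlanNode] = Nil
}
final case class BatchScan(requiredSchema: String) extends PlanNode {
  val children: Seq[PlanNode] = Nil
}
final case class NativeScan(requiredSchema: String) extends PlanNode {
  val children: Seq[PlanNode] = Nil
}
final case class Project(child: PlanNode) extends PlanNode {
  val children: Seq[PlanNode] = Seq(child)
}

object PlanInspection {
  // Pre-order traversal that keeps every node the partial function matches.
  def collectSchemas(plan: PlanNode)(pf: PartialFunction[PlanNode, String]): Seq[String] =
    pf.lift(plan).toList ++ plan.children.flatMap(c => collectSchemas(c)(pf))

  // Matches only the two scan types the test infrastructure knows about
  // (modelled here by the hypothetical FileSourceScan and BatchScan).
  def fileSourceSchemas(plan: PlanNode): Seq[String] =
    collectSchemas(plan) {
      case FileSourceScan(schema) => schema
      case BatchScan(schema)      => schema
    }

  // Assumed variant that also recognizes the native scan node.
  def fileSourceSchemasWithNative(plan: PlanNode): Seq[String] =
    collectSchemas(plan) {
      case FileSourceScan(schema) => schema
      case BatchScan(schema)      => schema
      case NativeScan(schema)     => schema
    }
}

object Demo {
  def main(args: Array[String]): Unit = {
    // A plan whose only scan is the native one.
    val plan = Project(NativeScan("struct<name:struct<first:string>>"))
    println(PlanInspection.fileSourceSchemas(plan).size)
    println(PlanInspection.fileSourceSchemasWithNative(plan).size)
  }
}
```

With a plan whose only scan is the native node, the first lookup returns no schemas, which mirrors the "Found 0 file sources" assertion failure; the second lookup finds the scan once the extra case is added.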
   
   ## Failing Tests
   
   - All `SchemaPruningSuite` tests including: select complex fields, nested 
field pruning, correlated subqueries, case-insensitive schema, generator 
output, Expand/Sort/Window, etc.
   - `Case-insensitive parser` variants from the same suite
   - `SPARK-37450: Prunes unnecessary fields from Explode for count aggregation`
   
   ## Root Cause
   
   `CometNativeScan` is neither a `FileSourceScanExec` nor a `BatchScanExec`, so 
the plan inspection utilities that look for file source scan nodes do not match 
it. The tests verify that schema pruning pushes the correct pruned schema down 
to the scan, but they cannot find the scan node to inspect.
   
   This is both a test infrastructure issue (tests don't know about 
`CometNativeScan`) and potentially a functional concern (schema pruning may not 
be happening the same way in native_datafusion).
   
   ## Related
   
   Discovered in CI for #3307 (enable native_datafusion in auto scan mode).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

