andygrove opened a new pull request, #4385:
URL: https://github.com/apache/datafusion-comet/pull/4385
## Which issue does this PR close?
Closes #.
## Rationale for this change
After the JVM Parquet reader paths (`native_comet` and
`native_iceberg_compat`) were retired, three Spark-derived helpers were left
behind with no production callers. They were originally needed for the JVM
vectorized reader to clip Spark request schemas against Parquet file schemas
and to build predicate column descriptors; the native Parquet scan does all of
that in Rust now.
The same cleanup pass surfaced something else: every reference to
`native_datafusion` in code, comments, and test names is now misleading. There
is only one scan, so naming it adds no information and makes the code read as
if other options still exist.
## What changes are included in this PR?
- Delete three dead files:
  - `spark/src/main/scala/org/apache/spark/sql/comet/parquet/CometParquetReadSupport.scala`
  - `spark/src/main/scala/org/apache/spark/sql/comet/parquet/CometSparkToParquetSchemaConverter.scala`
    (only referenced by the file above)
  - `spark/src/main/java/org/apache/parquet/filter2/predicate/SparkFilterApi.java`
    (untouched since the initial PR; no callers)
- Rename `CometScanRule.nativeDataFusionScan` to `nativeScan`.
- Update test names that contained `native_datafusion` to just say `native scan`
(5 tests in `ParquetReadV1Suite`, 1 in `CometTaskMetricsSuite`).
- Update comments in `ParquetReadSuite`, `ParquetEncryptionITCase`,
`CometNativeScan`, `parquet_exec.rs`, and the bug-triage skill that referred to
`native_datafusion` or `native_iceberg_compat`.
Net diff: 10 files, +17 / -767.
## How are these changes tested?
Covered by existing tests; the renames are mechanical and the deleted files
have no callers.
- `ParquetReadV1Suite "native scan rejects"`: 5/5 pass (the renamed
type-mismatch regression tests).
- `make` and `cargo clippy --all-targets --workspace -- -D warnings` both
green.
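
The "no callers" claim can be spot-checked with a plain text search. The following is a minimal sketch, not part of this PR: `stale_refs` is a hypothetical helper name, and a textual `grep` will not catch reflective or string-built references, so treat an empty result as supporting evidence rather than proof.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): print each given identifier
# that still appears somewhere under the given directory.
# Prints nothing when none of the identifiers are found.
stale_refs() {
  root="$1"
  shift
  for name in "$@"; do
    # -r: recurse, -q: quiet, -F: fixed string, -w: match whole words only
    if grep -rqFw -- "$name" "$root"; then
      echo "$name"
    fi
  done
}

# Example usage from the repository root, with names taken from this PR:
# stale_refs spark/src CometParquetReadSupport \
#   CometSparkToParquetSchemaConverter SparkFilterApi nativeDataFusionScan
```

An empty result for the three deleted class names and the old `nativeDataFusionScan` method name is what the description above would lead one to expect.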