Re: [PR] chore: Remove config option for `native_iceberg_compat` [datafusion-comet]

via GitHub Wed, 22 Apr 2026 18:32:37 -0700


parthchandra commented on PR #4019:
URL: 
https://github.com/apache/datafusion-comet/pull/4019#issuecomment-4301099715


   My general recommendation would be that we enable the ignored tests before dropping `native_iceberg_compat`.
   Also, the description of #3720 seems to indicate that the issue is more complex than just mismatched error messages (though I can be convinced that it is not a serious problem after all). We do have the framework to match Spark error messages, but perhaps it is not necessary for error messages to match exactly.
   
   Here's Claude's summary of the ignored tests:
   
     ### 1. `IgnoreCometNativeDataFusion`: skipped for `native_datafusion` and `auto`
   
     #### AdaptiveQueryExecSuite ([#3321](https://github.com/apache/datafusion-comet/issues/3321))
   
     | Test Name | Diffs |
     |-----------|-------|
     | `join key with multiple references on the filtering plan` | 4.0.1 |
     | `SPARK-43402: FileSourceScanExec supports push down data filter with scalar subquery` | 4.0.1 |
     | `alter temporary view should follow current storeAnalyzedPlanForView config` | 4.0.1 |
   
     #### AdaptiveQueryExecSuite ([#3442](https://github.com/apache/datafusion-comet/issues/3442))
   
     | Test Name | Diffs |
     |-----------|-------|
     | `static scan metrics` | 3.4.3, 3.5.8, 4.0.1 |
   
     #### FileBasedDataSourceSuite ([#3321](https://github.com/apache/datafusion-comet/issues/3321))
   
     | Test Name | Diffs |
     |-----------|-------|
     | `Enabling/disabling ignoreMissingFiles using parquet` (conditionally tagged only when `format == "parquet"`) | 4.0.1 |
     | `Enabling/disabling ignoreCorruptFiles` | 4.0.1 |
   
     #### ParquetFilterSuite (3.4.3 only)
   
     | Test Name | Diffs |
     |-----------|-------|
     | `filter pushdown - StringPredicate` (tagged `IgnoreCometNativeDataFusion` in 3.4.3; `IgnoreCometNativeScan` in 3.5.8/4.0.1) | 3.4.3 |
   
     #### ParquetSchemaSuite ([#3720](https://github.com/apache/datafusion-comet/issues/3720))
   
     | Test Name | Diffs |
     |-----------|-------|
     | `SPARK-35640: read binary as timestamp should throw schema incompatible error` | 3.4.3, 3.5.8, 4.0.1 |
     | `SPARK-35640: int as long should throw schema incompatible error` | 3.4.3, 3.5.8 |
     | `SPARK-47447: read TimestampLTZ as TimestampNTZ` | 4.0.1 |
     | `SPARK-36182: can't read TimestampLTZ as TimestampNTZ` | 3.4.3, 3.5.8 |
     | `SPARK-34212 Parquet should read decimals correctly` | 3.4.3, 3.5.8, 4.0.1 |
     | `row group skipping doesn't overflow when reading into larger type` | 3.4.3, 3.5.8, 4.0.1 |
   
     #### ParquetSchemaEvolutionSuite ([#3720](https://github.com/apache/datafusion-comet/issues/3720))
   
     | Test Name | Diffs |
     |-----------|-------|
     | `schema mismatch failure error message for parquet vectorized reader` | 3.4.3, 3.5.8, 4.0.1 |
     | `SPARK-45604: schema mismatch failure error on timestamp_ntz to array<timestamp_ntz>` | 3.4.3, 3.5.8, 4.0.1 |
   
     #### ParquetTypeWideningSuite ([#3321](https://github.com/apache/datafusion-comet/issues/3321))
   
     | Test Name | Diffs |
     |-----------|-------|
     | `parquet widening conversion DateType -> TimestampNTZType` (conditionally tagged) | 4.0.1 |
     | `unsupported parquet conversion $fromType -> $toType` (multiple type combos) | 4.0.1 |
     | `unsupported parquet timestamp conversion $fromType ($outputTimestampType) -> $toType` | 4.0.1 |
     | `parquet decimal precision change Decimal($fromPrecision, 2) -> Decimal($toPrecision, 2)` | 4.0.1 |
     | `parquet decimal precision and scale change Decimal($fromPrecision, $fromScale) -> Decimal($toPrecision, $toScale)` | 4.0.1 |
   
     ---
   
     ### 2. `assume()` — runtime skip
   
     #### ParquetRowIndexSuite ([#3886](https://github.com/apache/datafusion-comet/issues/3886)) (4.0.1 only)
   
     | Test Name | Condition |
     |-----------|-----------|
     | `invalid row index column type - ${conf.desc}` | Skipped when `COMET_NATIVE_SCAN_IMPL` is `SCAN_NATIVE_DATAFUSION` or `SCAN_AUTO`. Comet throws `RuntimeException` instead of `SparkException`. |
   
     #### CometExpressionSuite — Comet's own test suite
   
     | Test Name | Condition |
     |-----------|-----------|
     | `get_struct_field - select primitive fields` | Skipped when `scanImpl == SCAN_AUTO && Spark 4.0+` |
     | `get_struct_field - select subset of struct` | Skipped when `scanImpl == SCAN_AUTO && Spark 4.0+` |
     | `get_struct_field - read entire struct` | Skipped when `scanImpl == SCAN_AUTO && Spark 4.0+` |
   
     ---
   
     ### Summary by Tracking Issue
   
     | Issue | Count | Description |
     |-------|-------|-------------|
     | [#3321](https://github.com/apache/datafusion-comet/issues/3321) | ~12 | Schema evolution, corrupt/missing files, AQE, type widening |
     | [#3720](https://github.com/apache/datafusion-comet/issues/3720) | ~8 | Schema mismatch errors, decimal reads, row group skipping |
     | [#3442](https://github.com/apache/datafusion-comet/issues/3442) | 1 | Static scan metrics with DPP |
     | [#3886](https://github.com/apache/datafusion-comet/issues/3886) | 1 | Row index column type error type mismatch |
     | (no issue) | 5 | Filter pushdown / accumulator tests (`IgnoreCometNativeScan`) |
     | (no issue) | 3 | `get_struct_field` tests (auto + Spark 4.0+ only) |


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

