andygrove opened a new issue, #4298: URL: https://github.com/apache/datafusion-comet/issues/4298
## Background

`spark.comet.schemaEvolution.enabled` is an internal config that gates whether Comet's Parquet scan paths permit certain widening type promotions (`INT32 -> INT64`, `FLOAT -> DOUBLE`, and on Spark 4+ `INT32 -> DOUBLE`).

Defaults today (via `ShimCometConf`):

- Spark 3.x: `false`
- Spark 4.x: `true`

@mbutrovich raised the question in https://github.com/apache/datafusion-comet/pull/4229#issuecomment-4391539944:

> why do we have `spark.comet.schemaEvolution.enabled` config anymore? Maybe we should deprecate that first and help us simplify the story. I think it's legacy from when Comet's Parquet decoder could be called from Iceberg, which has different schema evolution semantics.

## Why this is worth investigating

The flag now exists primarily to model the *Spark version's* permissiveness rather than a user-tunable knob: Spark 3.x rejects these widenings, Spark 4.x accepts them. If that's its only purpose post Iceberg-decoder removal, the per-version default already encodes the right answer and a user-tunable internal config adds little besides surface area to reason about.

## Scope of this issue

Investigate and report back:

1. Are there any code paths today that flip this flag away from the per-version default (Iceberg integration, tests, callers outside the Comet codebase)?
2. Does keeping the flag enable any *correct* behavior that we'd lose by hardcoding per-version defaults?
3. If neither (1) nor (2), propose a deprecation path: rename to a non-tunable internal constant, fold the check into the version-specific shim, and update the contributor docs.

No code changes required up front; a writeup on the above is enough to decide next steps.

## Related

- #4229 (current PR, which uses this flag to gate the three widening-rejection cases on Spark 3.x)
- #4297 (broader gap of unconditionally-rejected primitive conversions that this flag does *not* govern)
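
To make the question concrete, here is a rough sketch of what "the per-version default already encodes the right answer" could look like if the check were folded into version-specific logic. It is illustrative only and is not Comet's actual code: `SparkMajor`, `PhysicalType`, `default_schema_evolution`, `widening_permitted`, and `widening_permitted_by_default` are all hypothetical names, and treating every other conversion as rejected is a simplifying assumption (see #4297 for the conversions this flag does not govern).

```rust
/// Hypothetical stand-in for the Spark major version a shim targets.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SparkMajor {
    V3,
    V4,
}

/// Simplified subset of Parquet physical types, just enough for the
/// three widenings discussed in this issue.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PhysicalType {
    Int32,
    Int64,
    Float,
    Double,
}

/// Per-version default as described above: false on Spark 3.x, true on 4.x.
pub fn default_schema_evolution(spark: SparkMajor) -> bool {
    match spark {
        SparkMajor::V3 => false,
        SparkMajor::V4 => true,
    }
}

/// Whether reading a column stored as `from` into a requested type `to`
/// is permitted, given the flag value `enabled`.
pub fn widening_permitted(
    from: PhysicalType,
    to: PhysicalType,
    spark: SparkMajor,
    enabled: bool,
) -> bool {
    use PhysicalType::*;
    match (from, to) {
        // Identical types need no promotion.
        _ if from == to => true,
        // Widenings gated by the flag on every Spark version.
        (Int32, Int64) | (Float, Double) => enabled,
        // Per the Background section, this one only applies on Spark 4+.
        (Int32, Double) => enabled && spark == SparkMajor::V4,
        // Assumption: anything else is rejected in this sketch.
        _ => false,
    }
}

/// The "hardcoded per-version default" variant: no user-tunable input.
pub fn widening_permitted_by_default(
    from: PhysicalType,
    to: PhysicalType,
    spark: SparkMajor,
) -> bool {
    widening_permitted(from, to, spark, default_schema_evolution(spark))
}
```

If the investigation finds no caller that passes an `enabled` value different from `default_schema_evolution(spark)`, then only `widening_permitted_by_default` would be needed, which is roughly the simplification item 3 proposes.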
