discord9 opened a new issue, #22142: URL: https://github.com/apache/datafusion/issues/22142
### Describe the bug The `unwrap_cast` optimization in `ExprSimplifier` (both logical and physical) rewrites patterns like `CAST(column AS target_type) <op> literal_target` into `column <op> CAST(literal_target AS source_type)` without verifying that the rewrite preserves query semantics. This default-allow behavior silently produces incorrect results for many common cast patterns. ### Specific broken scenarios **Timestamp precision downcast (many→one rewriting):** ```sql -- ts_ns has values 1704067200001000000 and 1704067200001234567 -- Both cast to the same timestamp(ms) SELECT * FROM t WHERE CAST(ts_ns AS timestamp(ms)) = timestamp_ms_value; -- Rewrites to: WHERE ts_ns = <nanosecond-literal> -- Result: only matches ONE nanosecond value, not both → WRONG row count ``` **Source domain not preserved (overflow/null loss):** ```sql -- column1 is Int64, literal is 9999999999 (doesn't fit in Int32) SELECT * FROM t WHERE CAST(column1 AS Int32) < 9999999999; -- Rewrites to: WHERE column1 < CAST(9999999999 AS Int32) -- The literal cast silently truncates → WRONG predicate ``` **Type semantics changed (string round-trip, float truncation):** ```sql SELECT * FROM t WHERE CAST(int_col AS Utf8) = '0123'; -- Rewrites to: WHERE int_col = 0123 -- '0123' is NOT equivalent to 123 → WRONG comparison ``` **Timezone silently dropped:** ```sql SELECT * FROM t WHERE CAST(ts AS timestamp(ms, 'UTC')) = timestamp_literal; -- Rewrites without preserving the timezone → semantics changed ``` ### Additional affected patterns - Int→UInt (negative values become NULL, rewrite doesn't account) - UInt64→Int64 (large UInt64 values above Int64::MAX) - Decimal scale downcast (fractional digits truncated) - Decimal→Int (truncation not preserved) - Float→Int / Float64→Float32 (precision loss) - Date↔Timestamp (many→one) - Dictionary↔Utf8 (dictionary key semantics) ### Impact The optimizer is silently producing wrong query results. There is no error or warning — the plan looks correct and the query returns results that are simply semantically wrong. This has existed since the feature was introduced. ### Proposed fix Replace the default-allow logic with a closed-by-default allowlist. Only cast patterns proven safe (with literal round-trip verification) should be eligible. Implementation in https://github.com/apache/datafusion/pull/21908. - Closed-by-default: no cast is unwrapped unless explicitly allowlisted - Current allowlist: same-timezone timestamp precision widening only - Require exact literal round-trip: lit to source_type and back must equal the original - All other casts (int widenings, string to int, decimals, dictionaries) intentionally kept in plan ### To Reproduce See `datafusion/sqllogictest/test_files/unwrap_cast.slt` or the unit tests in `datafusion/expr-common/src/casts.rs`, `datafusion/optimizer/src/simplify_expressions/unwrap_cast.rs`, `datafusion/physical-expr/src/simplifier/unwrap_cast.rs`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
