L. C. Hsieh created SPARK-46502:
-----------------------------------

             Summary: Support timestamp types in UnwrapCastInBinaryComparison
                 Key: SPARK-46502
                 URL: https://issues.apache.org/jira/browse/SPARK-46502
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 4.0.0
            Reporter: L. C. Hsieh
We have an optimization rule `UnwrapCastInBinaryComparison` that handles similar cases, but it doesn't cover timestamp types. For a query plan like:

```
== Analyzed Logical Plan ==
batch: timestamp
Project [batch#26466]
+- Filter (batch#26466 >= cast(2023-12-21 10:00:00 as timestamp))
   +- SubqueryAlias spark_catalog.default.timestamp_view
      +- View (`spark_catalog`.`default`.`timestamp_view`, [batch#26466])
         +- Project [cast(batch#26467 as timestamp) AS batch#26466]
            +- Project [cast(batch#26463 as timestamp) AS batch#26467]
               +- SubqueryAlias spark_catalog.default.table_timestamp
                  +- Relation spark_catalog.default.table_timestamp[batch#26463] parquet

== Optimized Logical Plan ==
Project [cast(batch#26463 as timestamp) AS batch#26466]
+- Filter (isnotnull(batch#26463) AND (cast(batch#26463 as timestamp) >= 2023-12-21 10:00:00))
   +- Relation spark_catalog.default.table_timestamp[batch#26463] parquet
```

the predicate compares a timestamp_ntz column with a literal value. Because the column is wrapped in a cast to timestamp type, the literal (a string) is also wrapped in a cast to timestamp type. The cast over the literal is foldable, so it is evaluated into a timestamp literal early, and the predicate becomes `cast(batch#26463 as timestamp) >= 2023-12-21 10:00:00`. Since the remaining cast is on the column side, the predicate cannot be pushed down to the data source/table.
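To illustrate the idea, here is a hypothetical, heavily simplified sketch (not the actual Spark implementation) of what extending the rule to timestamps could look like: the cast is moved from the column side to the literal side, so the column itself is exposed to pushdown. The `UnwrapTimestampCastSketch` object and its toy expression classes are made up for this example; the real rule would also have to verify that the cast is order-preserving and lossless before unwrapping.

```scala
// Toy sketch of unwrapping a cast in a binary comparison for timestamp types.
// All names here are hypothetical; this is not Catalyst's actual API.
object UnwrapTimestampCastSketch {
  sealed trait DataType
  case object TimestampType extends DataType
  case object TimestampNTZType extends DataType

  sealed trait Expr
  case class Attribute(name: String, dataType: DataType) extends Expr
  // value: microseconds since epoch, as Spark stores timestamps internally
  case class Literal(value: Long, dataType: DataType) extends Expr
  case class Cast(child: Expr, to: DataType) extends Expr
  case class GreaterThanOrEqual(left: Expr, right: Expr) extends Expr

  // Rewrite `cast(col as timestamp) >= tsLiteral` into
  // `col >= cast(tsLiteral as timestamp_ntz)` when the column is
  // timestamp_ntz. The conversion is assumed safe here; the real rule
  // must check monotonicity/overflow before unwrapping.
  def unwrap(e: Expr): Expr = e match {
    case GreaterThanOrEqual(Cast(attr: Attribute, TimestampType),
                            lit @ Literal(_, TimestampType))
        if attr.dataType == TimestampNTZType =>
      GreaterThanOrEqual(attr, Cast(lit, TimestampNTZType))
    case other => other
  }
}
```

After the rewrite, the cast sits on the foldable literal side, so it can be constant-folded away and the bare column comparison becomes eligible for data source pushdown.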