Sumeet created SPARK-32611:
------------------------------

             Summary: Querying ORC table in Spark3 using spark.sql.orc.impl=hive produces incorrect results when timestamp is present in predicate
                 Key: SPARK-32611
                 URL: https://issues.apache.org/jira/browse/SPARK-32611
             Project: Spark
          Issue Type: Bug
          Components: SQL
   Affects Versions: 3.0.0, 3.0.1
            Reporter: Sumeet

*How to reproduce this behavior?*

* TZ="America/Los_Angeles" ./bin/spark-shell --conf spark.sql.catalogImplementation=hive
* sql("set spark.sql.hive.convertMetastoreOrc=true")
* sql("set spark.sql.orc.impl=hive")
* sql("create table t_spark(col timestamp) stored as orc;")
* sql("insert into t_spark values (cast('2100-01-01 01:33:33.123America/Los_Angeles' as timestamp));")
* sql("select col, date_format(col, 'DD') from t_spark where col = cast('2100-01-01 01:33:33.123America/Los_Angeles' as timestamp);").show(false)

*This returns an empty result, which is incorrect.*

* sql("set spark.sql.orc.impl=native")
* sql("select col, date_format(col, 'DD') from t_spark where col = cast('2100-01-01 01:33:33.123America/Los_Angeles' as timestamp);").show(false)

*This returns 1 row, which is the expected output.*

The same query with spark.sql.hive.convertMetastoreOrc=true and spark.sql.orc.impl=hive returns *correct results if filter pushdown is turned off*.

* sql("set spark.sql.orc.impl=hive")
* sql("set spark.sql.orc.filterPushdown=false")
* sql("select col, date_format(col, 'DD') from t_spark where col = cast('2100-01-01 01:33:33.123America/Los_Angeles' as timestamp);").show(false)

*This returns 1 row, which is the expected output.*
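The three configurations above can be compared side by side. The following is a minimal sketch, not part of the original report: it assumes a spark-shell started as in the first step (Hive catalog, TZ="America/Los_Angeles"), where `spark` is the session the shell provides. The helper `matchingRows` is a hypothetical name introduced here for illustration, and the script drops and recreates `t_spark` so that it can be re-run.

```scala
// Paste into spark-shell with :paste. `spark` is the SparkSession provided by the shell.

// The timestamp literal used in the report, for both the insert and the predicate.
val ts = "cast('2100-01-01 01:33:33.123America/Los_Angeles' as timestamp)"

// Write the single test row using the same settings as the report.
spark.sql("set spark.sql.hive.convertMetastoreOrc=true")
spark.sql("set spark.sql.orc.impl=hive")
spark.sql("drop table if exists t_spark")
spark.sql("create table t_spark(col timestamp) stored as orc")
spark.sql(s"insert into t_spark values ($ts)")

// Hypothetical helper: run the filtered query under one configuration
// and return how many rows match the equality predicate.
def matchingRows(orcImpl: String, pushdown: Boolean): Long = {
  spark.sql(s"set spark.sql.orc.impl=$orcImpl")
  spark.sql(s"set spark.sql.orc.filterPushdown=$pushdown")
  spark.sql(s"select col from t_spark where col = $ts").count()
}

// Same three cases as the steps above: (hive, pushdown on), (native, pushdown on), (hive, pushdown off).
for ((impl, pushdown) <- Seq(("hive", true), ("native", true), ("hive", false))) {
  println(s"orc.impl=$impl filterPushdown=$pushdown rows=${matchingRows(impl, pushdown)}")
}
```

On the affected versions, the report implies that only the first case (hive reader with filter pushdown enabled) comes back with no rows, which would point at the pushed-down timestamp predicate rather than at the stored data.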