[ https://issues.apache.org/jira/browse/SPARK-32611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sumeet updated SPARK-32611:
---------------------------
Description:

*How to reproduce this behavior?*
 * TZ="America/Los_Angeles" ./bin/spark-shell
 * sql("set spark.sql.hive.convertMetastoreOrc=true")
 * sql("set spark.sql.orc.impl=hive")
 * sql("create table t_spark(col timestamp) stored as orc;")
 * sql("insert into t_spark values (cast('2100-01-01 01:33:33.123America/Los_Angeles' as timestamp));")
 * sql("select col, date_format(col, 'DD') from t_spark where col = cast('2100-01-01 01:33:33.123America/Los_Angeles' as timestamp);").show(false)

*This returns an empty result, which is incorrect.*
 * sql("set spark.sql.orc.impl=native")
 * sql("select col, date_format(col, 'DD') from t_spark where col = cast('2100-01-01 01:33:33.123America/Los_Angeles' as timestamp);").show(false)

*This returns 1 row, which is the expected output.*

The same query with (spark.sql.hive.convertMetastoreOrc=true, spark.sql.orc.impl=hive) returns *correct results if filter pushdown is turned off*:
 * sql("set spark.sql.orc.filterPushdown=false")
 * sql("select col, date_format(col, 'DD') from t_spark where col = cast('2100-01-01 01:33:33.123America/Los_Angeles' as timestamp);").show(false)

*This returns 1 row, which is the expected output.*
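For convenience, the steps above can be run as a single spark-shell session. The following is a minimal consolidated sketch, not taken verbatim from the report: it assumes the shell was started with TZ="America/Los_Angeles" on a Hive-enabled Spark 3.0.x build, reuses the t_spark table name and timestamp literal from the steps above, and drops the trailing semicolons inside sql().

{code:scala}
// Sketch of the repro (assumes: TZ="America/Los_Angeles" ./bin/spark-shell on a
// Hive-enabled Spark 3.0.x build; `sql` is imported by spark-shell as spark.sql).
sql("set spark.sql.hive.convertMetastoreOrc=true")
sql("set spark.sql.orc.impl=hive")

sql("create table t_spark(col timestamp) stored as orc")
sql("insert into t_spark values (cast('2100-01-01 01:33:33.123America/Los_Angeles' as timestamp))")

val q = """select col, date_format(col, 'DD') from t_spark
           where col = cast('2100-01-01 01:33:33.123America/Los_Angeles' as timestamp)"""

// hive reader with filter pushdown on: returns no rows (incorrect, per the report)
sql(q).show(false)

// native reader: returns the expected single row
sql("set spark.sql.orc.impl=native")
sql(q).show(false)

// hive reader with pushdown disabled: also returns the expected single row
sql("set spark.sql.orc.impl=hive")
sql("set spark.sql.orc.filterPushdown=false")
sql(q).show(false)
{code}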
* sql("set spark.sql.orc.filterPushdown=false") * sql("select col, date_format(col, 'DD') from t_spark where col = cast('2100-01-01 01:33:33.123America/Los_Angeles' as timestamp);").show(false) *This will return 1 row, which is the expected output.* > Querying ORC table in Spark3 using spark.sql.orc.impl=hive produces incorrect > when timestamp is present in predicate > -------------------------------------------------------------------------------------------------------------------- > > Key: SPARK-32611 > URL: https://issues.apache.org/jira/browse/SPARK-32611 > Project: Spark > Issue Type: Bug > Components: SQL > Affects Versions: 3.0.0, 3.0.1 > Reporter: Sumeet > Priority: Major > > *How to reproduce this behavior?* > * TZ="America/Los_Angeles" ./bin/spark-shell > * sql("set spark.sql.hive.convertMetastoreOrc=true") > * sql("set spark.sql.orc.impl=hive") > * sql("create table t_spark(col timestamp) stored as orc;") > * sql("insert into t_spark values (cast('2100-01-01 > 01:33:33.123America/Los_Angeles' as timestamp));") > * sql("select col, date_format(col, 'DD') from t_spark where col = > cast('2100-01-01 01:33:33.123America/Los_Angeles' as timestamp);").show(false) > *This will return empty results, which is incorrect.* > * sql("set spark.sql.orc.impl=native") > * sql("select col, date_format(col, 'DD') from t_spark where col = > cast('2100-01-01 01:33:33.123America/Los_Angeles' as timestamp);").show(false) > *This will return 1 row, which is the expected output.* > > The above query using (True, hive) returns *correct results if pushdown > filters are turned off*. > * sql("set spark.sql.orc.filterPushdown=false") > * sql("select col, date_format(col, 'DD') from t_spark where col = > cast('2100-01-01 01:33:33.123America/Los_Angeles' as timestamp);").show(false) > *This will return 1 row, which is the expected output.* -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org