Bruce Robbins created SPARK-31238:
-------------------------------------

             Summary: Incompatible ORC dates with Spark 2.4
                 Key: SPARK-31238
                 URL: https://issues.apache.org/jira/browse/SPARK-31238
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Bruce Robbins

Using Spark 2.4.5, write a pre-1582 date to an ORC file and then read it back:
{noformat}
$ export TZ=UTC
$ bin/spark-shell --conf spark.sql.session.timeZone=UTC
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.5-SNAPSHOT
      /_/

Using Scala version 2.11.12 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_161)
Type in expressions to have them evaluated.
Type :help for more information.

scala> sql("select cast('1200-01-01' as date) dt").write.mode("overwrite").orc("/tmp/datefile")

scala> spark.read.orc("/tmp/datefile").show
+----------+
|dt        |
+----------+
|1200-01-01|
+----------+

scala> :quit
{noformat}
Using Spark 3.0 (branch-3.0 at commit a934142f24), read the same file:
{noformat}
$ export TZ=UTC
$ bin/spark-shell --conf spark.sql.session.timeZone=UTC
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.0.0-SNAPSHOT
      /_/

Using Scala version 2.12.10 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_161)
Type in expressions to have them evaluated.
Type :help for more information.

scala> spark.read.orc("/tmp/datefile").show
+----------+
|dt        |
+----------+
|1200-01-08|
+----------+

scala>
{noformat}
The date read back by Spark 3.0 is off by 7 days: 1200-01-08 instead of 1200-01-01. Timestamps, on the other hand, appear to work as expected.
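
The report does not name a cause, but the 7-day shift equals the gap between the Julian and the proleptic Gregorian calendars for the year 1200. The sketch below reproduces that gap with JDK classes only. It assumes (this is not confirmed in the report) that the writer derives the stored day count from the hybrid Julian/Gregorian calendar of java.util, and that the reader interprets it with the proleptic Gregorian calendar of java.time. The object and method names are hypothetical, and none of this is Spark's actual code path.

```scala
import java.time.LocalDate
import java.util.{GregorianCalendar, TimeZone}
import java.util.concurrent.TimeUnit

// Hypothetical helper, for illustration only; not part of Spark.
object CalendarOffset {

  // Days since 1970-01-01 under java.util's hybrid calendar
  // (Julian before the default 1582-10-15 cutover, Gregorian after).
  def hybridEpochDay(year: Int, month: Int, day: Int): Long = {
    val cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"))
    cal.clear()                   // reset all fields, so the time of day is midnight
    cal.set(year, month - 1, day) // Calendar months are 0-based
    Math.floorDiv(cal.getTimeInMillis, TimeUnit.DAYS.toMillis(1))
  }

  // Days since 1970-01-01 under java.time's proleptic Gregorian calendar.
  def prolepticEpochDay(year: Int, month: Int, day: Int): Long =
    LocalDate.of(year, month, day).toEpochDay

  // Assumed round trip: day count produced with the hybrid calendar,
  // then interpreted with the proleptic Gregorian calendar.
  def rereadAsProleptic(year: Int, month: Int, day: Int): LocalDate =
    LocalDate.ofEpochDay(hybridEpochDay(year, month, day))

  def main(args: Array[String]): Unit = {
    println(rereadAsProleptic(1200, 1, 1)) // 1200-01-08
    println(hybridEpochDay(1200, 1, 1) - prolepticEpochDay(1200, 1, 1)) // 7
  }
}
```

Dates after the 1582 cutover produce the same day count under both calendars, so under this assumption only old dates would be affected.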