Maxim Gekk created SPARK-31183:
----------------------------------

             Summary: Incompatible Avro dates/timestamps with Spark 2.4
                 Key: SPARK-31183
                 URL: https://issues.apache.org/jira/browse/SPARK-31183
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Maxim Gekk

Write dates/timestamps to Avro files in Spark 2.4.5:
{code}
$ export TZ="America/Los_Angeles"
$ bin/spark-shell --packages org.apache.spark:spark-avro_2.11:2.4.5
{code}
{code:scala}
scala> df.write.format("avro").save("/Users/maxim/tmp/before_1582/2_4_5_date_avro")

scala> spark.read.format("avro").load("/Users/maxim/tmp/before_1582/2_4_5_date_avro").show(false)
+----------+
|date      |
+----------+
|1001-01-01|
+----------+

scala> df2.write.format("avro").save("/Users/maxim/tmp/before_1582/2_4_5_ts_avro")

scala> spark.read.format("avro").load("/Users/maxim/tmp/before_1582/2_4_5_ts_avro").show(false)
+--------------------------+
|ts                        |
+--------------------------+
|1001-01-01 01:02:03.123456|
+--------------------------+
{code}

Spark 3.0.0-preview2 (and 3.1.0-SNAPSHOT) reads different values from the files written by Spark 2.4.5:
{code}
$ export TZ="America/Los_Angeles"
$ bin/spark-shell --packages org.apache.spark:spark-avro_2.12:2.4.5
{code}
{code:scala}
scala> spark.conf.set("spark.sql.session.timeZone", "America/Los_Angeles")

scala> spark.read.format("avro").load("/Users/maxim/tmp/before_1582/2_4_5_date_avro").show(false)
+----------+
|date      |
+----------+
|1001-01-07|
+----------+

scala> spark.read.format("avro").load("/Users/maxim/tmp/before_1582/2_4_5_ts_avro").show(false)
+--------------------------+
|ts                        |
+--------------------------+
|1001-01-07 01:09:05.123456|
+--------------------------+
{code}
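
The six-day shift in the date column is what one would expect if the writer computed the stored day count in the hybrid Julian/Gregorian calendar and the reader decoded it in the proleptic Gregorian calendar. The report itself does not name the cause, so this is an assumption. The sketch below reproduces the arithmetic with JDK classes only, without Spark or Avro. The object and method names (`CalendarShift`, `hybridEpochDay`, `rereadAsProleptic`) are hypothetical helpers introduced for illustration and are not part of any Spark API.

```scala
import java.time.LocalDate
import java.util.{GregorianCalendar, TimeZone}

// Hypothetical helper, not a Spark API: illustrates the calendar mismatch only.
object CalendarShift {
  private val MillisPerDay = 86400000L

  // Days since 1970-01-01 for a date label interpreted in the hybrid
  // Julian/Gregorian calendar (java.util.GregorianCalendar switches from
  // Julian to Gregorian at its default cutover, 1582-10-15).
  def hybridEpochDay(year: Int, month: Int, day: Int): Long = {
    val cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"))
    cal.clear()                   // reset all fields, including time of day
    cal.set(year, month - 1, day) // Calendar months are 0-based
    java.lang.Math.floorDiv(cal.getTimeInMillis, MillisPerDay)
  }

  // Decode the same day count in the proleptic Gregorian (ISO-8601) calendar,
  // which is what java.time.LocalDate uses.
  def rereadAsProleptic(year: Int, month: Int, day: Int): LocalDate =
    LocalDate.ofEpochDay(hybridEpochDay(year, month, day))
}
```

Pasted into a Scala shell, `CalendarShift.rereadAsProleptic(1001, 1, 1)` should return `1001-01-07`, matching the date shown above, while a date after the 1582 cutover such as `(2020, 1, 1)` comes back unchanged. The additional 7 minutes 2 seconds in the timestamp column is not covered by this sketch.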