This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new b6e8f64  [SPARK-31284][SQL][TESTS] Check rebasing of timestamps in ORC datasource
b6e8f64 is described below

commit b6e8f64d49caf1f0a1f1b910d603e8e000270d01
Author: Maxim Gekk <max.g...@gmail.com>
AuthorDate: Fri Mar 27 09:06:59 2020 -0700

    [SPARK-31284][SQL][TESTS] Check rebasing of timestamps in ORC datasource

    ### What changes were proposed in this pull request?
    In the PR, I propose 2 tests to check that rebasing of timestamps from/to the hybrid calendar (Julian + Gregorian) to/from the Proleptic Gregorian calendar works correctly.
    1. The test `compatibility with Spark 2.4 in reading timestamps` loads an ORC file saved by Spark 2.4.5 via:
    ```shell
    $ export TZ="America/Los_Angeles"
    ```
    ```scala
    scala> spark.conf.set("spark.sql.session.timeZone", "America/Los_Angeles")

    scala> val df = Seq("1001-01-01 01:02:03.123456").toDF("tsS").select($"tsS".cast("timestamp").as("ts"))
    df: org.apache.spark.sql.DataFrame = [ts: timestamp]

    scala> df.write.orc("/Users/maxim/tmp/before_1582/2_4_5_ts_orc")

    scala> spark.read.orc("/Users/maxim/tmp/before_1582/2_4_5_ts_orc").show(false)
    +--------------------------+
    |ts                        |
    +--------------------------+
    |1001-01-01 01:02:03.123456|
    +--------------------------+
    ```
    2. The test `rebasing timestamps in write` is a round-trip test. Since the previous test confirms correct rebasing of timestamps on read, this test should pass only if rebasing also works correctly on write.

    ### Why are the changes needed?
    To guarantee that rebasing works correctly for timestamps in the ORC datasource.

    ### Does this PR introduce any user-facing change?
    No

    ### How was this patch tested?
    By running `OrcSourceSuite` for Hive 1.2 and 2.3 via the commands:
    ```
    $ build/sbt -Phive-2.3 "test:testOnly *OrcSourceSuite"
    ```
    and
    ```
    $ build/sbt -Phive-1.2 "test:testOnly *OrcSourceSuite"
    ```

    Closes #28047 from MaxGekk/rebase-ts-orc-test.

    Authored-by: Maxim Gekk <max.g...@gmail.com>
    Signed-off-by: Dongjoon Hyun <dongj...@apache.org>
    (cherry picked from commit fc2a974e030c82bf500a81c3908f853c3eeb761d)
    Signed-off-by: Dongjoon Hyun <dongj...@apache.org>
---
 .../test-data/before_1582_ts_v2_4.snappy.orc       | Bin 0 -> 251 bytes
 .../execution/datasources/orc/OrcSourceSuite.scala |  28 +++++++++++++++++++++
 2 files changed, 28 insertions(+)

diff --git a/sql/core/src/test/resources/test-data/before_1582_ts_v2_4.snappy.orc b/sql/core/src/test/resources/test-data/before_1582_ts_v2_4.snappy.orc
new file mode 100644
index 0000000..af9ef04
Binary files /dev/null and b/sql/core/src/test/resources/test-data/before_1582_ts_v2_4.snappy.orc differ
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcSourceSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcSourceSuite.scala
index b5e002f..0b7500c 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcSourceSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcSourceSuite.scala
@@ -508,6 +508,34 @@ abstract class OrcSuite extends OrcTest with BeforeAndAfterAll {
       }
     }
   }
+
+  test("SPARK-31284: compatibility with Spark 2.4 in reading timestamps") {
+    Seq(false, true).foreach { vectorized =>
+      withSQLConf(SQLConf.ORC_VECTORIZED_READER_ENABLED.key -> vectorized.toString) {
+        checkAnswer(
+          readResourceOrcFile("test-data/before_1582_ts_v2_4.snappy.orc"),
+          Row(java.sql.Timestamp.valueOf("1001-01-01 01:02:03.123456")))
+      }
+    }
+  }
+
+  test("SPARK-31284: rebasing timestamps in write") {
+    withTempPath { dir =>
+      val path = dir.getAbsolutePath
+      Seq("1001-01-01 01:02:03.123456").toDF("tsS")
+        .select($"tsS".cast("timestamp").as("ts"))
+        .write
+        .orc(path)
+
+      Seq(false, true).foreach { vectorized =>
+        withSQLConf(SQLConf.ORC_VECTORIZED_READER_ENABLED.key -> vectorized.toString) {
+          checkAnswer(
+            spark.read.orc(path),
+            Row(java.sql.Timestamp.valueOf("1001-01-01 01:02:03.123456")))
+        }
+      }
+    }
+  }
 }
 
 class OrcSourceSuite extends OrcSuite with SharedSparkSession {

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
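
The commit message refers to rebasing between the hybrid (Julian + Gregorian) calendar and the Proleptic Gregorian calendar. As a rough illustration of why such rebasing is needed for a date like 1001-01-01, the sketch below compares how the two calendars map the same local date label to a day count since 1970-01-01. It uses only JDK classes (`java.util.GregorianCalendar` for the hybrid calendar, `java.time.LocalDate` for the proleptic one). The names `RebaseSketch`, `hybridEpochDay`, and `prolepticEpochDay` are hypothetical and are not part of Spark; this is not how Spark implements rebasing.

```scala
import java.time.LocalDate
import java.util.{GregorianCalendar, TimeZone}

// Hypothetical helper object for illustration only; not Spark code.
object RebaseSketch {
  private val MillisPerDay = 24L * 60 * 60 * 1000

  // Days since 1970-01-01 for a local date interpreted in the hybrid
  // calendar: java.util.GregorianCalendar uses Julian rules before
  // 1582-10-15 and Gregorian rules afterwards.
  def hybridEpochDay(year: Int, month: Int, day: Int): Long = {
    val cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"))
    cal.clear()                   // drop the current time set by the constructor
    cal.set(year, month - 1, day) // Calendar months are zero-based
    Math.floorDiv(cal.getTimeInMillis, MillisPerDay)
  }

  // Days since 1970-01-01 for the same label in the Proleptic Gregorian
  // calendar, which java.time (ISO chronology) uses for all dates.
  def prolepticEpochDay(year: Int, month: Int, day: Int): Long =
    LocalDate.of(year, month, day).toEpochDay

  def main(args: Array[String]): Unit = {
    val hybrid = hybridEpochDay(1001, 1, 1)
    val proleptic = prolepticEpochDay(1001, 1, 1)
    // The two calendars disagree on which physical day "1001-01-01" is.
    println(s"hybrid=$hybrid proleptic=$proleptic diff=${hybrid - proleptic}")
  }
}
```

For dates on or after 1582-10-15 the two functions agree, so only older values, such as the timestamp used in the tests above, are affected by the calendar difference.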