[ https://issues.apache.org/jira/browse/SPARK-31662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17102525#comment-17102525 ]
Apache Spark commented on SPARK-31662:
--------------------------------------

User 'MaxGekk' has created a pull request for this issue:
https://github.com/apache/spark/pull/28479

> Reading wrong dates from dictionary encoded columns in Parquet files
> --------------------------------------------------------------------
>
>                 Key: SPARK-31662
>                 URL: https://issues.apache.org/jira/browse/SPARK-31662
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.0.0, 3.1.0
>            Reporter: Maxim Gekk
>            Priority: Major
>
> Write dates with dictionary encoding enabled to parquet files:
> {code:scala}
> Welcome to
>       ____              __
>      / __/__  ___ _____/ /__
>     _\ \/ _ \/ _ `/ __/ '_/
>    /___/ .__/\_,_/_/ /_/\_\   version 3.1.0-SNAPSHOT
>       /_/
>
> Using Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_242)
> Type in expressions to have them evaluated.
> Type :help for more information.
>
> scala> spark.conf.set("spark.sql.legacy.parquet.rebaseDateTimeInWrite.enabled", true)
>
> scala> :paste
> // Entering paste mode (ctrl-D to finish)
>
> Seq.tabulate(8)(_ => "1001-01-01").toDF("dateS")
>   .select($"dateS".cast("date").as("date"))
>   .repartition(1)
>   .write
>   .option("parquet.enable.dictionary", true)
>   .mode("overwrite")
>   .parquet("/Users/maximgekk/tmp/parquet-date-dict")
>
> // Exiting paste mode, now interpreting.
> {code}
> Read them back:
> {code:scala}
> scala> spark.read.parquet("/Users/maximgekk/tmp/parquet-date-dict").show(false)
> +----------+
> |date      |
> +----------+
> |1001-01-07|
> |1001-01-07|
> |1001-01-07|
> |1001-01-07|
> |1001-01-07|
> |1001-01-07|
> |1001-01-07|
> |1001-01-07|
> +----------+
> {code}
> *Expected values must be 1001-01-01.*
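> The 6-day shift (1001-01-01 read back as 1001-01-07) matches the offset between the hybrid Julian and proleptic Gregorian calendars in year 1001, which points at a rebase being skipped (or applied twice) on the read path for dictionary-encoded pages. Below is a minimal sketch to narrow that down, assuming the same spark-shell session as above; the {{parquet-date-plain}} path is hypothetical, chosen only for this check:
> {code:scala}
> // Write the same dates with dictionary encoding disabled. If the issue is
> // specific to dictionary-encoded columns, as the title suggests, this copy
> // should round-trip correctly as 1001-01-01.
> Seq.tabulate(8)(_ => "1001-01-01").toDF("dateS")
>   .select($"dateS".cast("date").as("date"))
>   .repartition(1)
>   .write
>   .option("parquet.enable.dictionary", false) // plain (non-dictionary) encoding
>   .mode("overwrite")
>   .parquet("/Users/maximgekk/tmp/parquet-date-plain")
>
> spark.read.parquet("/Users/maximgekk/tmp/parquet-date-plain").show(false)
>
> // Re-read the dictionary-encoded copy with the vectorized reader turned off,
> // to check whether the non-vectorized parquet-mr path rebases these dates
> // correctly.
> spark.conf.set("spark.sql.parquet.enableVectorizedReader", false)
> spark.read.parquet("/Users/maximgekk/tmp/parquet-date-dict").show(false)
> {code}
> If only the dictionary-encoded, vectorized read is off by 6 days, the bug most likely sits in the vectorized reader's dictionary-decoding path rather than in the write-side rebase.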