[ https://issues.apache.org/jira/browse/SPARK-34259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17275666#comment-17275666 ]
Apache Spark commented on SPARK-34259:
--------------------------------------

User 'd80tb7' has created a pull request for this issue:
https://github.com/apache/spark/pull/31399

> Reading a partitioned dataset with a partition value of NOW causes the value to be parsed as a timestamp.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-34259
>                 URL: https://issues.apache.org/jira/browse/SPARK-34259
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.1
>            Reporter: Chris Martin
>            Priority: Minor
>
> *Problem*
> Reading a partitioned dataset where one of the partition column values matches a
> special timestamp string (NOW, TODAY, etc.) causes the value to be interpreted as a
> timestamp rather than a string.
>
> *Example Code (Scala)*
> {code:java}
> import org.apache.spark.sql.SparkSession
> import org.apache.spark.sql.functions._
>
> object TestBug {
>   def main(args: Array[String]): Unit = {
>     val spark = SparkSession.builder().master("local[*]").getOrCreate()
>
>     // Write a single row whose partition value is the literal string "NOW".
>     val df = spark.range(1, 2).withColumn("partition", lit("NOW"))
>     df.write.mode("overwrite").partitionBy("partition").parquet("bug")
>
>     // Read it back: the partition column type is inferred from the directory name.
>     spark.read.parquet("bug").show(truncate = false)
>   }
> }
> {code}
> The above program prints out:
> {noformat}
> +---+--------------------------+
> |id |partition                 |
> +---+--------------------------+
> |1  |2021-01-27 08:53:23.650039|
> +---+--------------------------+
> {noformat}
>
> *Analysis*
> This happens because PartitioningUtils.inferPartitionColumnValue tries to cast
> the partition value to a timestamp in order to determine whether timestamp is a
> valid interpretation. Because NOW etc. are literals that can validly be cast to
> timestamps, the code ends up interpreting the value as a timestamp.
> I think what we want to do here is change
> PartitioningUtils.inferPartitionColumnValue so that when it attempts to
> interpret the value as a timestamp, it ignores the special values. This looks
> difficult to do if we continue to use cast, but one other option is to add an
> option to DateTimeUtils.stringToDate to tell it to ignore special values and
> instead use that to do the conversion.
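
The idea proposed in the analysis can be sketched in plain Scala without any Spark dependency. This is only an illustration under stated assumptions, not the code from the pull request: `PartitionInferenceSketch`, `infer`, and the `ignoreSpecial` flag are hypothetical names, the list of special strings (epoch, now, today, tomorrow, yesterday) is assumed from the "NOW, TODAY etc" wording above, and `java.sql.Timestamp.valueOf` stands in for Spark's own string-to-timestamp conversion.

```scala
import java.sql.Timestamp
import java.time.{Instant, LocalDate}
import java.util.Locale
import scala.util.Try

// Hypothetical, simplified stand-in for partition value inference.
// This is NOT Spark's PartitioningUtils or DateTimeUtils code.
object PartitionInferenceSketch {
  sealed trait Inferred
  final case class AsTimestamp(value: Timestamp) extends Inferred
  final case class AsString(value: String) extends Inferred

  // Assumed set of special strings; mimics a cast that resolves them to real timestamps.
  private def special(name: String): Option[Timestamp] = {
    val today = LocalDate.now()
    name match {
      case "epoch"     => Some(Timestamp.from(Instant.EPOCH))
      case "now"       => Some(Timestamp.from(Instant.now()))
      case "today"     => Some(Timestamp.valueOf(today.atStartOfDay()))
      case "tomorrow"  => Some(Timestamp.valueOf(today.plusDays(1).atStartOfDay()))
      case "yesterday" => Some(Timestamp.valueOf(today.minusDays(1).atStartOfDay()))
      case _           => None
    }
  }

  // With ignoreSpecial = true, special strings are never resolved,
  // so a partition value such as "NOW" falls through and stays a string.
  def infer(raw: String, ignoreSpecial: Boolean): Inferred = {
    val trimmed = raw.trim
    val fromSpecial =
      if (ignoreSpecial) None else special(trimmed.toLowerCase(Locale.ROOT))
    // Timestamp.valueOf throws on anything that is not "yyyy-[m]m-[d]d hh:mm:ss[.f...]".
    fromSpecial.orElse(Try(Timestamp.valueOf(trimmed)).toOption) match {
      case Some(ts) => AsTimestamp(ts)
      case None     => AsString(raw)
    }
  }
}
```

With `ignoreSpecial = false` the sketch reproduces the reported behaviour ("NOW" becomes a timestamp), while `ignoreSpecial = true` keeps "NOW" as a string and still infers a timestamp for a value like "2021-01-27 08:53:23.650039".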