[ https://issues.apache.org/jira/browse/SPARK-22165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wenchen Fan reassigned SPARK-22165:
-----------------------------------

    Assignee: Hyukjin Kwon

> Type conflicts between dates, timestamps and date in partition column
> ---------------------------------------------------------------------
>
>                 Key: SPARK-22165
>                 URL: https://issues.apache.org/jira/browse/SPARK-22165
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.1, 2.2.0, 2.3.0
>            Reporter: Hyukjin Kwon
>            Assignee: Hyukjin Kwon
>            Priority: Minor
>             Fix For: 2.3.0
>
>
> It looks like we have some bugs when resolving type conflicts in partition columns.
> I found a few corner cases, as below:
> Case 1: timestamp should be inferred, but date type is inferred.
> {code}
> val df = Seq((1, "2015-01-01"), (2, "2016-01-01 00:00:00")).toDF("i", "ts")
> df.write.format("parquet").partitionBy("ts").save("/tmp/foo")
> spark.read.load("/tmp/foo").printSchema()
> {code}
> {code}
> root
>  |-- i: integer (nullable = true)
>  |-- ts: date (nullable = true)
> {code}
> Case 2: decimal should be inferred, but integer is inferred.
> {code}
> val df = Seq((1, "1"), (2, "1" * 30)).toDF("i", "decimal")
> df.write.format("parquet").partitionBy("decimal").save("/tmp/bar")
> spark.read.load("/tmp/bar").printSchema()
> {code}
> {code}
> root
>  |-- i: integer (nullable = true)
>  |-- decimal: integer (nullable = true)
> {code}
> It looks like we should de-duplicate the type resolution logic if possible, rather than
> keeping a separate numeric-precedence-like comparison on its own.
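
For illustration only, the behaviour the two cases ask for can be sketched without Spark: infer a type for each partition value, then fold the inferred types through a single widening function instead of a numeric-only precedence list. Everything in this sketch (the `PType` hierarchy, `PartitionTypeSketch`, `inferOne`, `widen`, `resolve`, and the ordering Int < Long < Decimal < Double) is a hypothetical simplification assumed for the example; it is not Spark's actual partition-inference code and does not claim to match the fix that was merged.

```scala
import java.time.{LocalDate, LocalDateTime}
import java.time.format.DateTimeFormatter
import scala.util.Try

// Hypothetical, simplified stand-ins for Spark's data types.
sealed trait PType
case object IntT extends PType
case object LongT extends PType
case object DecimalT extends PType
case object DoubleT extends PType
case object DateT extends PType
case object TimestampT extends PType
case object StringT extends PType

object PartitionTypeSketch {
  // Assumed timestamp layout, matching the value used in Case 1.
  private val tsFormat = DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss")

  // Infer the narrowest type that can hold a single raw partition value.
  def inferOne(raw: String): PType =
    if (Try(raw.toInt).isSuccess) IntT
    else if (Try(raw.toLong).isSuccess) LongT
    else if (Try(BigDecimal(raw)).toOption.exists(_.scale <= 0)) DecimalT
    else if (Try(raw.toDouble).isSuccess) DoubleT
    else if (Try(LocalDate.parse(raw)).isSuccess) DateT
    else if (Try(LocalDateTime.parse(raw, tsFormat)).isSuccess) TimestampT
    else StringT

  // Assumed numeric widening order for this sketch only.
  private val numericRank: Map[PType, Int] =
    Map(IntT -> 0, LongT -> 1, DecimalT -> 2, DoubleT -> 3)

  // One shared rule for every pair of types: equal types stay as they are,
  // numerics widen by rank, date and timestamp widen to timestamp,
  // and anything else falls back to string.
  def widen(a: PType, b: PType): PType = (a, b) match {
    case _ if a == b => a
    case _ if numericRank.contains(a) && numericRank.contains(b) =>
      if (numericRank(a) >= numericRank(b)) a else b
    case (DateT, TimestampT) | (TimestampT, DateT) => TimestampT
    case _ => StringT
  }

  // Resolve the column type across all values seen for one partition column.
  def resolve(values: Seq[String]): PType =
    values.map(inferOne).reduceOption(widen).getOrElse(StringT)
}
```

Under these assumptions, `PartitionTypeSketch.resolve(Seq("2015-01-01", "2016-01-01 00:00:00"))` yields `TimestampT` and `PartitionTypeSketch.resolve(Seq("1", "1" * 30))` yields `DecimalT`, which are the outcomes Case 1 and Case 2 expect.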