[ https://issues.apache.org/jira/browse/SPARK-36861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17420549#comment-17420549 ]

Gengliang Wang edited comment on SPARK-36861 at 9/27/21, 8:06 AM:
------------------------------------------------------------------

Hmm, the PR https://github.com/apache/spark/pull/33709 is only on master. I can't reproduce your case on 3.2.0 RC4 with:

{code:scala}
val df = Seq(("2021-01-01T00", 0), ("2021-01-01T01", 1), ("2021-01-01T02", 2)).toDF("hour", "i")
df.write.partitionBy("hour").parquet("/tmp/t1")
spark.read.parquet("/tmp/t1").schema
res2: org.apache.spark.sql.types.StructType = StructType(StructField(i,IntegerType,true), StructField(hour,StringType,true))
{code}

The issue can be reproduced on Spark master though.
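Until it is resolved on master, a minimal workaround sketch (not from this thread; it assumes the goal is to keep 'hour' as a string, and reuses the /tmp/t1 path from the reproduction above):

{code:scala}
// Option 1: disable partition column type inference entirely, so every
// partition column is read back as a string regardless of its format.
spark.conf.set("spark.sql.sources.partitionColumnTypeInference.enabled", "false")
spark.read.parquet("/tmp/t1").schema
// hour stays StringType because no inference is attempted.

// Option 2: pin the schema explicitly on read; a user-supplied schema
// takes precedence over whatever type inference would have produced.
import org.apache.spark.sql.types._

val schema = StructType(Seq(
  StructField("i", IntegerType),
  StructField("hour", StringType)))
spark.read.schema(schema).parquet("/tmp/t1").schema
{code}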





> Partition columns are overly eagerly parsed as dates
> ----------------------------------------------------
>
>                 Key: SPARK-36861
>                 URL: https://issues.apache.org/jira/browse/SPARK-36861
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Tanel Kiis
>            Priority: Major
>
> I have an input directory with subdirs:
> * hour=2021-01-01T00
> * hour=2021-01-01T01
> * hour=2021-01-01T02
> * ...
> In Spark 3.1 the 'hour' column is parsed as a string type, but in the 3.2 
> RC it is parsed as a date type and the hour part is lost.


