[ https://issues.apache.org/jira/browse/SPARK-11995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15027423#comment-15027423 ]

Jack Arenas edited comment on SPARK-11995 at 11/25/15 7:24 PM:
---------------------------------------------------------------

It seems the issue comes from CatalystSchemaConverter.scala: DateType is only 
ever parsed from an INT32, and reading from a partition column may change the 
type to binary (I'm guessing), which means adding

    case DATE => DateType

after line 171 might do the trick. Investigating now.
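For context, here is a minimal, self-contained sketch of the kind of INT32 
dispatch being described; the type names below are illustrative stand-ins, not 
the actual Parquet or Spark identifiers, and the real converter's match lives 
in CatalystSchemaConverter.scala:

    // Hedged sketch, NOT the actual Spark source: a toy version of the
    // INT32 original-type dispatch, with the proposed DATE case added.
    sealed trait OriginalType
    case object NONE    extends OriginalType
    case object DATE    extends OriginalType

    sealed trait CatalystType
    case object IntegerType extends CatalystType
    case object DateType    extends CatalystType

    def convertInt32(originalType: OriginalType): CatalystType =
      originalType match {
        case NONE => IntegerType
        case DATE => DateType   // the proposed addition for SPARK-11995
        case other =>
          sys.error(s"Unsupported INT32 original type: $other")
      }

Without the DATE arm, an int32 annotated as a date would fall through to the 
error branch (or, in the partition-discovery path, be inferred as a string), 
which matches the symptom reported below.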


> Partitioning Parquet by DateType
> --------------------------------
>
>                 Key: SPARK-11995
>                 URL: https://issues.apache.org/jira/browse/SPARK-11995
>             Project: Spark
>          Issue Type: Improvement
>    Affects Versions: 1.5.2
>            Reporter: Jack Arenas
>            Priority: Minor
>
> ... After writing to s3 and partitioning by a DateType column, reads on the 
> parquet "table" (i.e. s3n://s3_bucket_url/table where date partitions break 
> the table into date-based s3n://s3_bucket_url/table/date=2015-11-25 chunks) 
> will show the partitioned date column as a StringType...
> https://github.com/databricks/spark-redshift/issues/122



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to