[jira] [Commented] (SPARK-11412) Support merge schema for ORC

Michael Armbrust (JIRA) Tue, 03 Nov 2015 03:35:31 -0800

    [ https://issues.apache.org/jira/browse/SPARK-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14987127#comment-14987127 ]


Michael Armbrust commented on SPARK-11412:
------------------------------------------

Schema merging is currently supported only for Parquet.

> Support merge schema for ORC
> ----------------------------
>
>                 Key: SPARK-11412
>                 URL: https://issues.apache.org/jira/browse/SPARK-11412
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Dave
>
> I tried to load partitioned ORC files whose schemas differ slightly in a 
> nested column. Say the column is:
>  |-- request: struct (nullable = true)
>  |    |-- datetime: string (nullable = true)
>  |    |-- host: string (nullable = true)
>  |    |-- ip: string (nullable = true)
>  |    |-- referer: string (nullable = true)
>  |    |-- request_uri: string (nullable = true)
>  |    |-- uri: string (nullable = true)
>  |    |-- useragent: string (nullable = true)
> The later partitions also have a page_url_lists attribute in this struct.
> I tried to load the data with:
>   val s = sqlContext.read.format("orc").option("mergeSchema", "true").load("/data/warehouse/xxxx")
> But the resulting schema doesn't show request.page_url_lists.
> Is schema merging not supported for ORC?
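
For illustration only, the merged schema the reporter expected can be sketched in plain Scala. This is a minimal sketch under stated assumptions, not Spark's implementation: FieldType, StringT, StructT and SchemaMergeSketch are hypothetical stand-ins for Spark's schema types, only two of the request fields are shown, and page_url_lists is given a string placeholder type because the report does not state its type.

```scala
object SchemaMergeSketch {
  // Hypothetical, simplified schema model; these are NOT Spark's StructType/StructField.
  sealed trait FieldType
  case object StringT extends FieldType
  final case class StructT(fields: Vector[(String, FieldType)]) extends FieldType

  // Union two struct schemas by field name, recursing into nested structs.
  def merge(a: StructT, b: StructT): StructT = {
    val bByName = b.fields.toMap
    val aNames = a.fields.map(_._1).toSet
    // Keep a's fields in their original order, merging any that also occur in b.
    val merged = a.fields.map { case (name, t) =>
      (t, bByName.get(name)) match {
        case (x: StructT, Some(y: StructT)) => name -> merge(x, y)
        case (_, None)                      => name -> t
        case (x, Some(y)) if x == y         => name -> t
        case (x, Some(y)) =>
          throw new IllegalArgumentException(
            s"Conflicting types for field '$name': $x vs $y")
      }
    }
    // Append the fields that exist only in b.
    StructT(merged ++ b.fields.filterNot { case (n, _) => aNames(n) })
  }

  // Schema of the earlier partitions (abbreviated to two of the request fields).
  val older: StructT = StructT(Vector(
    "request" -> StructT(Vector("host" -> StringT, "uri" -> StringT))))

  // Schema of the later partitions; the StringT type for page_url_lists is an assumption.
  val newer: StructT = StructT(Vector(
    "request" -> StructT(Vector(
      "host" -> StringT, "uri" -> StringT, "page_url_lists" -> StringT))))
}
```

Calling SchemaMergeSketch.merge(older, newer) yields a request struct that contains page_url_lists alongside the fields common to both partitions, which is the result the mergeSchema option was expected to produce for ORC.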



