[ https://issues.apache.org/jira/browse/SPARK-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384604#comment-15384604 ]
Apache Spark commented on SPARK-15705:
--------------------------------------

User 'yhuai' has created a pull request for this issue:
https://github.com/apache/spark/pull/14267

> Spark won't read ORC schema from metastore for partitioned tables
> -----------------------------------------------------------------
>
>                 Key: SPARK-15705
>                 URL: https://issues.apache.org/jira/browse/SPARK-15705
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>         Environment: HDP 2.3.4 (Hive 1.2.1, Hadoop 2.7.1)
>            Reporter: Nic Eggert
>            Assignee: Yin Huai
>            Priority: Critical
>
> Spark does not seem to read the schema from the Hive metastore for partitioned tables stored as ORC files. It appears to read the schema from the files themselves, which, if they were created with Hive, does not match the metastore schema (at least not before Hive 2.0; see HIVE-4243). To reproduce:
> In Hive:
> {code}
> hive> create table default.test (id BIGINT, name STRING) partitioned by (state STRING) stored as orc;
> hive> insert into table default.test partition (state="CA") values (1, "mike"), (2, "steve"), (3, "bill");
> {code}
> In Spark:
> {code}
> scala> spark.table("default.test").printSchema
> {code}
> Expected result: Spark should preserve the column names that were defined in Hive.
> Actual result:
> {code}
> root
>  |-- _col0: long (nullable = true)
>  |-- _col1: string (nullable = true)
>  |-- state: string (nullable = true)
> {code}
> Possibly related to SPARK-14959?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
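A possible interim workaround, until the linked pull request is merged: reassign the column names after loading the table. This is a sketch, not part of the issue or the fix, and it assumes the column order Spark returns matches the metastore definition (id, name, state):

{code}
scala> // Workaround sketch (hypothetical, not from the issue): rename the
scala> // positional ORC column names (_col0, _col1, ...) back to the
scala> // metastore names. Assumes column order matches the Hive definition.
scala> val fixed = spark.table("default.test").toDF("id", "name", "state")
scala> fixed.printSchema
{code}

This only papers over the read path; the underlying bug is that the schema is taken from the ORC files instead of the metastore.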