[ https://issues.apache.org/jira/browse/SPARK-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384604#comment-15384604 ]
Apache Spark commented on SPARK-15705:
--------------------------------------

User 'yhuai' has created a pull request for this issue:
https://github.com/apache/spark/pull/14267

> Spark won't read ORC schema from metastore for partitioned tables
> -----------------------------------------------------------------
>
>                 Key: SPARK-15705
>                 URL: https://issues.apache.org/jira/browse/SPARK-15705
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>         Environment: HDP 2.3.4 (Hive 1.2.1, Hadoop 2.7.1)
>            Reporter: Nic Eggert
>            Assignee: Yin Huai
>            Priority: Critical
>
> Spark does not seem to read the schema from the Hive metastore for partitioned tables stored as ORC files. It appears to read the schema from the files themselves, which, if they were created with Hive, does not match the metastore schema (at least not before Hive 2.0; see HIVE-4243). To reproduce:
> In Hive:
> {code}
> hive> create table default.test (id BIGINT, name STRING) partitioned by (state STRING) stored as orc;
> hive> insert into table default.test partition (state="CA") values (1, "mike"), (2, "steve"), (3, "bill");
> {code}
> In Spark:
> {code}
> scala> spark.table("default.test").printSchema
> {code}
> Expected result: Spark should preserve the column names that were defined in Hive.
> Actual result:
> {code}
> root
>  |-- _col0: long (nullable = true)
>  |-- _col1: string (nullable = true)
>  |-- state: string (nullable = true)
> {code}
> Possibly related to SPARK-14959?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
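A possible interim workaround, until the linked pull request is merged: reassign the column names after loading the table. This is a sketch, not part of the issue or the fix, and it assumes the column order Spark returns matches the metastore definition (id, name, state):

{code}
scala> // Workaround sketch (hypothetical, not from the issue): rename the
scala> // positional ORC column names (_col0, _col1, ...) back to the
scala> // metastore names. Assumes column order matches the Hive definition.
scala> val fixed = spark.table("default.test").toDF("id", "name", "state")
scala> fixed.printSchema
{code}

This only papers over the read path; the underlying bug is that the schema is taken from the ORC files instead of the metastore.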