[ https://issues.apache.org/jira/browse/SPARK-9762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940776#comment-14940776 ]
Simeon Simeonov commented on SPARK-9762:
----------------------------------------

[~yhuai] the Hive compatibility section of the documentation should be updated to identify these cases. It is unfortunate to trust the docs only to discover a known lack of compatibility that was not documented.

> ALTER TABLE cannot find column
> ------------------------------
>
>                 Key: SPARK-9762
>                 URL: https://issues.apache.org/jira/browse/SPARK-9762
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.4.1
>         Environment: Ubuntu on AWS
>            Reporter: Simeon Simeonov
>
> {{ALTER TABLE tbl CHANGE}} cannot find a column that {{DESCRIBE COLUMN}} lists.
> In the case of a table generated with {{HiveContext.read.json()}}, the output of {{DESCRIBE dimension_components}} is:
> {code}
> comp_config struct<adText:string,adTextLeft:string,background:string,brand:string,button_color:string,cta_side:string,cta_type:string,depth:string,fixed_under:string,light:string,mid_text:string,oneline:string,overhang:string,shine:string,style:string,style_secondary:string,style_small:string,type:string>
> comp_criteria string
> comp_data_model string
> comp_dimensions struct<data:string,integrations:array<string>,template:string,variation:bigint>
> comp_disabled boolean
> comp_id bigint
> comp_path string
> comp_placementData struct<mod:string>
> comp_slot_types array<string>
> {code}
> However, {{alter table dimension_components change comp_dimensions comp_dimensions struct<data:string,integrations:array<string>,template:string,variation:bigint,z:string>;}} fails with:
> {code}
> 15/08/08 23:13:07 ERROR exec.DDLTask: org.apache.hadoop.hive.ql.metadata.HiveException: Invalid column reference comp_dimensions
> at org.apache.hadoop.hive.ql.exec.DDLTask.alterTable(DDLTask.java:3584)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:312)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1503)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1270)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1088)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
> at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$runHive$1.apply(ClientWrapper.scala:345)
> at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$runHive$1.apply(ClientWrapper.scala:326)
> at org.apache.spark.sql.hive.client.ClientWrapper.withHiveState(ClientWrapper.scala:155)
> at org.apache.spark.sql.hive.client.ClientWrapper.runHive(ClientWrapper.scala:326)
> at org.apache.spark.sql.hive.client.ClientWrapper.runSqlHive(ClientWrapper.scala:316)
> at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:473)
> ...
> {code}
> Meanwhile, {{SHOW COLUMNS in dimension_components}} lists two columns: {{col}} (which does not exist in the table) and {{z}}, which was just added.
> This suggests that DDL operations in Spark SQL use table metadata inconsistently.
> Full spark-sql output [here|https://gist.github.com/ssimeonov/636a25d6074a03aafa67].
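
For reference, the failing statement in the quoted report only appends one field (z:string) to the struct type that DESCRIBE prints for comp_dimensions. The following is a minimal Python sketch of how that statement can be derived from the DESCRIBE output. The helper names add_struct_field and change_column_ddl are hypothetical and are not part of Spark or Hive; the sketch only builds the DDL string and does not execute it or reproduce the error.

```python
def add_struct_field(struct_type: str, name: str, field_type: str) -> str:
    """Append `name:field_type` to a Hive-style struct<...> type string.

    Hypothetical helper for illustration; it does plain string handling
    and does not validate the nested field types.
    """
    prefix, suffix = "struct<", ">"
    if not (struct_type.startswith(prefix) and struct_type.endswith(suffix)):
        raise ValueError(f"not a struct type: {struct_type!r}")
    # Strip the outer "struct<" and the final ">" only; nested types such as
    # array<string> stay untouched inside the body.
    body = struct_type[len(prefix):-len(suffix)]
    new_field = f"{name}:{field_type}"
    body = f"{body},{new_field}" if body else new_field
    return f"{prefix}{body}{suffix}"


def change_column_ddl(table: str, column: str, new_type: str) -> str:
    """Build an ALTER TABLE ... CHANGE statement that keeps the column name."""
    return f"ALTER TABLE {table} CHANGE {column} {column} {new_type}"


# Type of comp_dimensions as listed by DESCRIBE in the report.
old_type = (
    "struct<data:string,integrations:array<string>,"
    "template:string,variation:bigint>"
)
new_type = add_struct_field(old_type, "z", "string")
ddl = change_column_ddl("dimension_components", "comp_dimensions", new_type)
print(ddl)
```

The resulting string matches the statement from the report apart from keyword case and the trailing semicolon, which makes it clear that the column name passed to CHANGE is exactly the one DESCRIBE lists.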