[ https://issues.apache.org/jira/browse/SPARK-18355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-18355:
------------------------------------

    Assignee: (was: Apache Spark)

> Spark SQL fails to read data from an ORC Hive table that has a new column added to it
> -------------------------------------------------------------------------------------
>
>                 Key: SPARK-18355
>                 URL: https://issues.apache.org/jira/browse/SPARK-18355
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.2, 2.1.0, 2.2.0
>         Environment: Centos6
>            Reporter: Sandeep Nemuri
>
> *PROBLEM*:
> Spark SQL fails to read data from an ORC Hive table after a new column has been added to it.
> Below is the exception:
> {code}
> scala> sqlContext.sql("select click_id,search_id from testorc").show
> 16/11/03 22:17:53 INFO ParseDriver: Parsing command: select click_id,search_id from testorc
> 16/11/03 22:17:54 INFO ParseDriver: Parse Completed
> java.lang.AssertionError: assertion failed
>     at scala.Predef$.assert(Predef.scala:165)
>     at org.apache.spark.sql.execution.datasources.LogicalRelation$$anonfun$1.apply(LogicalRelation.scala:39)
>     at org.apache.spark.sql.execution.datasources.LogicalRelation$$anonfun$1.apply(LogicalRelation.scala:38)
>     at scala.Option.map(Option.scala:145)
>     at org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:38)
>     at org.apache.spark.sql.execution.datasources.LogicalRelation.copy(LogicalRelation.scala:31)
>     at org.apache.spark.sql.hive.HiveMetastoreCatalog.org$apache$spark$sql$hive$HiveMetastoreCatalog$$convertToOrcRelation(HiveMetastoreCatalog.scala:588)
>     at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$$anonfun$apply$2.applyOrElse(HiveMetastoreCatalog.scala:647)
>     at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$$anonfun$apply$2.applyOrElse(HiveMetastoreCatalog.scala:643)
>     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
>     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
>     at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
>     at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:334)
>     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
>     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
>     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:281)
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> {code}
>
> *STEPS TO SIMULATE THIS ISSUE*:
> 1) Create the table in Hive:
> {code}
> CREATE TABLE `testorc`(
>   `click_id` string,
>   `search_id` string,
>   `uid` bigint)
> PARTITIONED BY (
>   `ts` string,
>   `hour` string)
> STORED AS ORC;
> {code}
> 2) Load data into the table:
> {code}
> INSERT INTO TABLE testorc PARTITION (ts = '98765', hour = '01') VALUES (12, 2, 12345);
> {code}
> 3) Select through the Spark shell (this works):
> {code}
> sqlContext.sql("select click_id,search_id from testorc").show
> {code}
> 4) Add a column to the Hive table:
> {code}
> ALTER TABLE testorc ADD COLUMNS (dummy string);
> {code}
> 5) Select from the Spark shell again (this now fails):
> {code}
> scala> sqlContext.sql("select click_id,search_id from testorc").show
> 16/11/03 22:17:53 INFO ParseDriver: Parsing command: select click_id,search_id from testorc
> 16/11/03 22:17:54 INFO ParseDriver: Parse Completed
> java.lang.AssertionError: assertion failed
>     at scala.Predef$.assert(Predef.scala:165)
>     at org.apache.spark.sql.execution.datasources.LogicalRelation$$anonfun$1.apply(LogicalRelation.scala:39)
>     at org.apache.spark.sql.execution.datasources.LogicalRelation$$anonfun$1.apply(LogicalRelation.scala:38)
>     at scala.Option.map(Option.scala:145)
>     at org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:38)
>     at org.apache.spark.sql.execution.datasources.LogicalRelation.copy(LogicalRelation.scala:31)
>     at org.apache.spark.sql.hive.HiveMetastoreCatalog.org$apache$spark$sql$hive$HiveMetastoreCatalog$$convertToOrcRelation(HiveMetastoreCatalog.scala:588)
>     at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$$anonfun$apply$2.applyOrElse(HiveMetastoreCatalog.scala:647)
>     at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$$anonfun$apply$2.applyOrElse(HiveMetastoreCatalog.scala:643)
>     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
>     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
>     at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
>     at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:334)
>     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
>     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
>     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:281)
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>     at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>     at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
>     at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
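For context (an editorial note, not part of the original report): the assertion at LogicalRelation.scala:39 fires when the expected output attributes taken from the Hive metastore no longer line up with the attributes of the ORC relation Spark converted the table into. After ALTER TABLE ... ADD COLUMNS the metastore reports four columns while the converted ORC relation still exposes three, so the check fails. A minimal, self-contained Scala sketch of that mismatch (the object and variable names here are illustrative, not Spark's actual internals):

{code}
// Illustrative sketch: mimics the size check behind
// "java.lang.AssertionError: assertion failed" at LogicalRelation.scala:39.
object SchemaMismatchSketch {
  def main(args: Array[String]): Unit = {
    // Columns the Hive metastore reports after ALTER TABLE ... ADD COLUMNS
    val metastoreSchema = Seq("click_id", "search_id", "uid", "dummy")
    // Columns the converted ORC relation still exposes
    val orcRelationSchema = Seq("click_id", "search_id", "uid")
    // LogicalRelation effectively asserts the two attribute lists line up;
    // here 4 != 3, so this throws java.lang.AssertionError, as in the report.
    assert(metastoreSchema.size == orcRelationSchema.size)
  }
}
{code}

If this reading is right, one possible workaround on the affected versions is to skip the metastore-to-ORC conversion path, e.g. `sqlContext.setConf("spark.sql.hive.convertMetastoreOrc", "false")`, so the query is read through the Hive SerDe instead; whether the performance trade-off is acceptable depends on the workload.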