Re: Schema change on Spark Hive (Parquet file format) table not working
Yes, I've found a number of problems with metadata management in Spark SQL. One core issue is SPARK-9764 (https://issues.apache.org/jira/browse/SPARK-9764). Related issues are SPARK-9342 (https://issues.apache.org/jira/browse/SPARK-9342), SPARK-9761 (https://issues.apache.org/jira/browse/SPARK-9761) and SPARK-9762 (https://issues.apache.org/jira/browse/SPARK-9762).

I've also observed a case where, after an exception in ALTER TABLE, Spark SQL thought a table had 0 rows while, in fact, all the data was still there. I was not able to reproduce this one reliably, so I did not create a JIRA issue for it.

Let's vote for these issues and get them resolved.

Re: Schema change on Spark Hive (Parquet file format) table not working
Code snippet, in short:

    hiveContext.sql("""CREATE EXTERNAL TABLE IF NOT EXISTS people_table (name String, age INT)
      ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
      STORED AS INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
      OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat'""")
    hiveContext.sql("INSERT INTO TABLE people_table SELECT name, age FROM temp_table_people1")

    // Here, the data was read successfully.
    hiveContext.sql("SELECT * FROM people_table")

    hiveContext.sql("ALTER TABLE people_table ADD COLUMNS (gender STRING)")

    // Here, the existing data cannot be read and an ArrayIndexOutOfBoundsException is thrown.
    hiveContext.sql("SELECT * FROM people_table")