Hi All,

I wrote out a complex Parquet file from Spark SQL and now I am trying to put a Hive table on top. I am running into issues with creating the Hive table itself. Here is the JSON that I wrote out to Parquet using Spark SQL:

{"user_id":"4513","providers":[{"id":"4220","name":"dbmvl","behaviors":{"b1":"gxybq","b2":"ntfmx"}},{"id":"4173","name":"dvjke","behaviors":{"b1":"sizow","b2":"knuuc"}}]}
{"user_id":"3960","providers":[{"id":"1859","name":"ponsv","behaviors":{"b1":"ahfgc","b2":"txpea"}},{"id":"103","name":"uhqqo","behaviors":{"b1":"lktyo","b2":"ituxy"}}]}
{"user_id":"567","providers":[{"id":"9622","name":"crjju","behaviors":{"b1":"rhaqc","b2":"npnot"}},{"id":"6965","name":"fnheh","behaviors":{"b1":"eipse","b2":"nvxqk"}}]}

I basically created a Hive context, read in the JSON file using jsonFile, and then wrote it back out using saveAsParquetFile. Afterwards I tried to create a Hive table on top of the Parquet file. Here is the Hive HQL that I have:

    create table test (mycol STRUCT<user_id:String, providers:ARRAY<STRUCT<id:String, name:String, behaviors:MAP<String, String>>>>) stored as parquet;
    Alter table test set location 'hdfs:///tmp/test.parquet';

I get an error when I try to do a select * on the table:

    Failed with exception java.io.IOException:java.lang.IllegalStateException: Column mycol at index 0 does not exist in {providers=providers, user_id=user_id}

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Creating-a-hive-table-on-top-of-a-parquet-file-written-out-by-spark-tp22084.html
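
One reading of the error above: the Parquet file exposes user_id and providers as top-level columns, while the table declares a single column mycol that wraps them. The sketch below (plain Python, standard library only) walks one of the sample records and prints a Hive DDL whose top-level columns mirror the record. The helper names hive_type and hive_ddl are hypothetical and not part of Spark or Hive. It assumes JSON objects become STRUCT (which is how Spark's JSON schema inference treats nested objects, so behaviors comes out as a STRUCT rather than a MAP), that arrays are non-empty and homogeneous, and that an external table with an inline LOCATION is acceptable. Whether Hive then reads the file cleanly has not been verified here.

```python
import json


def hive_type(value):
    """Map a parsed JSON value to a Hive type string (illustrative helper)."""
    if isinstance(value, dict):
        # Assumption: nested JSON objects are stored as structs, not maps.
        fields = ", ".join(f"{k}:{hive_type(v)}" for k, v in value.items())
        return f"STRUCT<{fields}>"
    if isinstance(value, list):
        # Assumption: arrays are non-empty and every element has the same shape.
        return f"ARRAY<{hive_type(value[0])}>"
    if isinstance(value, bool):
        return "BOOLEAN"
    if isinstance(value, int):
        return "BIGINT"
    if isinstance(value, float):
        return "DOUBLE"
    return "STRING"


def hive_ddl(table, record, location):
    """Build a CREATE TABLE statement with one Hive column per top-level key."""
    columns = ", ".join(f"{name} {hive_type(value)}" for name, value in record.items())
    return (
        f"CREATE EXTERNAL TABLE {table} ({columns}) "
        f"STORED AS PARQUET LOCATION '{location}'"
    )


# First sample record from the post.
sample = (
    '{"user_id":"4513","providers":['
    '{"id":"4220","name":"dbmvl","behaviors":{"b1":"gxybq","b2":"ntfmx"}},'
    '{"id":"4173","name":"dvjke","behaviors":{"b1":"sizow","b2":"knuuc"}}]}'
)
record = json.loads(sample)
ddl = hive_ddl("test", record, "hdfs:///tmp/test.parquet")
print(ddl)
```

For the sample record this prints a statement with two top-level columns, user_id STRING and providers ARRAY<STRUCT<id:STRING, name:STRING, behaviors:STRUCT<b1:STRING, b2:STRING>>>, which lines up with the {providers=providers, user_id=user_id} column set named in the exception.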