Re: SchemaRDD.saveAsTable() when schema contains arrays and was loaded from a JSON file using schema auto-detection

2014-11-27 Thread Kelly, Jonathan
> Hello Jonathan, There was a bug regarding casting data types before inserting into a Hive table. Hive does not have

Re: SchemaRDD.saveAsTable() when schema contains arrays and was loaded from a JSON file using schema auto-detection

2014-11-26 Thread Yin Huai
Hello Jonathan, There was a bug regarding casting data types before inserting into a Hive table. Hive does not have the notion of "containsNull" for array values. So, for a Hive table, containsNull is always true for an array, and we should ignore this field for Hive. This issue has been f
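
To make the point about ignoring containsNull concrete, here is a minimal sketch in plain Python. It is not Spark's or Hive's code: the dict-based schema layout and the helper name force_nullable_arrays are hypothetical, chosen only to show the idea of treating every array as containsNull = true before comparing a schema against a Hive table.

```python
# Toy schema model (an assumption for illustration, not Spark's data structures):
#   {"kind": "integer"}
#   {"kind": "array", "element": <type>, "containsNull": bool}
#   {"kind": "struct", "fields": [{"name": str, "dataType": <type>}, ...]}

def force_nullable_arrays(data_type):
    """Return a copy of the toy schema with containsNull=True on every array."""
    kind = data_type["kind"]
    if kind == "array":
        return {
            "kind": "array",
            "element": force_nullable_arrays(data_type["element"]),
            "containsNull": True,  # Hive has no notion of non-null array elements
        }
    if kind == "struct":
        return {
            "kind": "struct",
            "fields": [
                {"name": f["name"], "dataType": force_nullable_arrays(f["dataType"])}
                for f in data_type["fields"]
            ],
        }
    # Primitive types are copied unchanged.
    return dict(data_type)


# Hypothetical schema with a non-nullable integer array column named "values".
inferred = {
    "kind": "struct",
    "fields": [
        {
            "name": "values",
            "dataType": {
                "kind": "array",
                "element": {"kind": "integer"},
                "containsNull": False,
            },
        }
    ],
}

hive_view = force_nullable_arrays(inferred)
print(hive_view["fields"][0]["dataType"]["containsNull"])
```

Once both sides are normalized this way, two schemas that differ only in containsNull compare as equal, which is the behavior the reply describes as desirable for Hive tables.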

Re: SchemaRDD.saveAsTable() when schema contains arrays and was loaded from a JSON file using schema auto-detection

2014-11-26 Thread Kelly, Jonathan
After playing around with this a little more, I discovered that: 1. If test.json contains something like {"values":[null,1,2,3]}, the schema auto-determined by SchemaRDD.jsonFile() will have "element: integer (containsNull = true)", and then SchemaRDD.saveAsTable()/SchemaRDD.insertInto() will work
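
The dependence of containsNull on the sample data can be illustrated with a small plain-Python sketch. This is not how SchemaRDD.jsonFile() is implemented; infer_array_element is a hypothetical helper that only mimics the observed outcome, and it assumes that an array with no null in the data is reported as containsNull = false.

```python
import json


def infer_array_element(values):
    """Return (element_type, contains_null) for a JSON array of ints and nulls.

    Toy inference rule (an assumption): containsNull is true only if a null
    actually appears in the sampled data.
    """
    contains_null = any(v is None for v in values)
    non_null = [v for v in values if v is not None]
    is_int = all(isinstance(v, int) and not isinstance(v, bool) for v in non_null)
    element_type = "integer" if is_int else "string"
    return element_type, contains_null


# The two example inputs discussed in the thread.
for line in ('{"values":[null,1,2,3]}', '{"values":[1,2,3]}'):
    record = json.loads(line)
    element_type, contains_null = infer_array_element(record["values"])
    flag = str(contains_null).lower()
    print(f"{line} -> element: {element_type} (containsNull = {flag})")
```

Under this rule the same logical column gets a different schema depending on whether the file happens to contain a null, which matches the data-dependent behavior described above.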

SchemaRDD.saveAsTable() when schema contains arrays and was loaded from a JSON file using schema auto-detection

2014-11-26 Thread Kelly, Jonathan
I've noticed some strange behavior when I try to use SchemaRDD.saveAsTable() with a SchemaRDD that I've loaded from a JSON file that contains elements with nested arrays. For example, with a file test.json that contains the single line: {"values":[1,2,3]} and with code like the following