Hi, I have a question related to Spark-Avro; I am not sure if this is the best place to ask.

I have the following Scala case classes, populated with data in a Spark application, and I tried to save them in Avro format to HDFS:

```scala
case class Claim ( ...... )

case class Coupon ( account_id: Long ........ claims: List[Claim] )
```

As the above example shows, the Coupon case class contains a List of Claim. The RDD holds an iterator of Coupon data, which I try to save to HDFS. I am using Spark 1.3.1 with Spark-Avro 1.0.0 (which matches Spark 1.3.x):

```scala
rdd.toDF.save("hdfs_location", "com.databricks.spark.avro")
```

I have no problem saving the data this way, but the problem is that I cannot use the Avro data in Hive. Here is the schema Spark-Avro generates for the above data:

```json
{
  "type": "record",
  "name": "topLevelRecord",
  "fields": [
    { "name": "account_id", "type": "long" },
    ........
    { "name": "claims",
      "type": [
        { "type": "array",
          "items": [
            { "type": "record",
              "name": "claims",
              "fields": [
                ......
```

The claims field is generated as a union containing an array, instead of an array of structs directly. Or, to make it clearer, here is the schema Hive reports when pointing a table at the data generated by Spark-Avro:

```
desc table
OK
col_name      data_type                                                          comment
account_id    bigint                                                             from deserializer
.......
claims        uniontype<array<uniontype<struct<account_id:bigint, .......>>>    from deserializer
```

Obviously, this causes trouble when querying this data in Hive (at least in Hive 0.12, which we currently use), so end users cannot run queries like "select claims[0].account_id from table".

I wonder why Spark-Avro has to wrap a union structure in this case, instead of just building "array<struct>"? Or better, is there a way I can control the Avro schema generated in this case in Spark-Avro?

Thanks,
Yong
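For what it's worth, here is roughly the schema shape I would have expected Spark-Avro to produce for the claims field (sketched by hand, so the inner field list is only illustrative): a plain array of records, with no union wrappers, which Hive would map straight to array&lt;struct&gt;:

```json
{
  "type": "record",
  "name": "topLevelRecord",
  "fields": [
    { "name": "account_id", "type": "long" },
    { "name": "claims",
      "type": {
        "type": "array",
        "items": {
          "type": "record",
          "name": "claims",
          "fields": [
            { "name": "account_id", "type": "long" }
          ]
        }
      }
    }
  ]
}
```

If I could get Spark-Avro to emit something like this, I assume the Hive column would come out as array&lt;struct&lt;account_id:bigint, ...&gt;&gt; and the "claims[0].account_id" style queries would work.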