> On July 13, 2016, 11:11 a.m., Chaoyu Tang wrote:
> > Thanks [~yshi] for the patch. It looks good. But I have a couple of questions:
> > It seems to me that the union in the existing code is only used to support
> > the nullable type in Avro, and has not been fully supported as a data type in
> > general. This patch actually extends (or adds) this type support.
> > So with the patch, how can we distinguish between a nullable and a
> > non-nullable Avro union? For example, for the following field schemas, both
> > might end up with type uniontype<int, bigint>
> > {code}
> > "fields":[
> >   {
> >     "name":"value",
> >     "type":[
> >       "null",
> >       "int",
> >       "long"
> >     ],
> >     "default":null
> >   }
> > ]
> > ---
> > "fields":[
> >   {
> >     "name":"value",
> >     "type":[
> >       "int",
> >       "long"
> >     ],
> >     "default": 0
> >   }
> > ]
> > {code}
> > Will there be any problem? Also, could we add some qtests using Avro union
> > data (with or without null)?

Hi [~ctang], thanks for the review!

Your concern that both nullable and non-nullable Avro unions may end up with the same union type in Hive is very sound. My understanding is that every column in Hive is nullable (there isn't any keyword like "not null" or "primary key" in Hive). As a result, the schema ["null", "int", "long"] should always be used in favor of ["int", "long"]. The latter is supported by Hive just for better compatibility. So it should be OK to map both ["null", "int", "long"] and ["int", "long"] to "uniontype<int,bigint>".

Please let me know your opinions. I will try to add qtests as you suggested.

- Yibing


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49952/#review141997
-----------------------------------------------------------


On July 12, 2016, 9:07 p.m., Yibing Shi wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49952/
> -----------------------------------------------------------
>
> (Updated July 12, 2016, 9:07 p.m.)
>
>
> Review request for hive and Chaoyu Tang.
>
>
> Bugs: HIVE-14205
>     https://issues.apache.org/jira/browse/HIVE-14205
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> HIVE-14205: Hive doesn't support union type with AVRO file format
>
>
> Diffs
> -----
>
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java 6165138
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 08ee62b
>   serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroDeserializer.java 986b803
>   serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerdeUtils.java 0013b78
>
> Diff: https://reviews.apache.org/r/49952/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Yibing Shi
>
>
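
For illustration only, here is a minimal sketch of the mapping proposed in the reply: the "null" branch of an Avro union is dropped before the Hive type is built, so the nullable and non-nullable schemas from the review comment produce the same type string. This is not the code in the patch. The class `AvroUnionMapping` and its methods are hypothetical names, Avro branches are modelled as plain strings instead of Avro `Schema` objects, and only a few primitive types are covered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of HIVE-14205 or the Hive code base.
public class AvroUnionMapping {

    // Map an Avro primitive type name to a Hive type name (small subset only).
    public static String hiveTypeName(String avroType) {
        switch (avroType) {
            case "int":     return "int";
            case "long":    return "bigint";
            case "float":   return "float";
            case "double":  return "double";
            case "boolean": return "boolean";
            case "string":  return "string";
            default:
                throw new IllegalArgumentException("Not handled in this sketch: " + avroType);
        }
    }

    // Build the Hive type for an Avro union given as a list of branch type names.
    // The "null" branch is ignored, on the assumption that every Hive column is nullable.
    public static String toHiveType(List<String> avroUnion) {
        List<String> branches = new ArrayList<>();
        for (String branch : avroUnion) {
            if (!"null".equals(branch)) {
                branches.add(hiveTypeName(branch));
            }
        }
        if (branches.isEmpty()) {
            throw new IllegalArgumentException("Union has no non-null branch");
        }
        if (branches.size() == 1) {
            // ["null", "int"] is just a nullable int, not a real union.
            return branches.get(0);
        }
        return "uniontype<" + String.join(",", branches) + ">";
    }

    public static void main(String[] args) {
        // Both schemas from the review comment map to the same Hive type.
        System.out.println(toHiveType(Arrays.asList("null", "int", "long"))); // uniontype<int,bigint>
        System.out.println(toHiveType(Arrays.asList("int", "long")));         // uniontype<int,bigint>
    }
}
```

The sketch makes the trade-off visible: once the "null" branch is dropped, the Hive type alone no longer records whether the original Avro union was nullable.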