[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973615#comment-13973615 ]
Anthony Hsu commented on HIVE-6835:
-----------------------------------

What happens is that Hive tries to build ObjectInspectorConverters that convert from the partition schema to the table schema. If the partition schema differs from the table schema, you may get a ClassCastException like the one in the issue description.

When you add new columns at the end, this is not a problem, because the converter only covers the fields that both schemas have and the extra trailing columns are chopped off. See ObjectInspectorConverters.StructConverter:
{code}
int minFields = Math.min(inputFields.size(), outputFields.size());
fieldConverters = new ArrayList<Converter>(minFields);
{code}

It is only when you insert new columns at the beginning or in the middle that you might run into ClassCastExceptions.

For the AvroSerDe, if it always uses the latest schema (which should be the table-level schema), Hive will not get confused when constructing its ObjectInspectorConverters. Later, when the AvroSerDe actually reads the Avro files, it can compare the latest schema with the (possibly old) schemas stored in the Avro data files themselves and do the proper schema resolution, omitting fields or substituting default values, following the [schema resolution rules|http://avro.apache.org/docs/current/spec.html#Schema+Resolution].

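To make the "end versus beginning" distinction concrete, here is a minimal sketch of positional field pairing. It is not Hive code: the class PositionalPairingSketch and the method firstMismatch are hypothetical names, and column types are modeled as plain strings rather than ObjectInspectors. It assumes fields are paired by index up to minFields, as the snippet above suggests.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not Hive code. Column types are
// modeled as plain strings instead of ObjectInspectors.
class PositionalPairingSketch {

    // Pairs fields by position up to the shorter schema's length and returns
    // the index of the first pair whose types differ, or -1 if all pairs match.
    static int firstMismatch(List<String> partitionTypes, List<String> tableTypes) {
        int minFields = Math.min(partitionTypes.size(), tableTypes.size());
        for (int i = 0; i < minFields; i++) {
            if (!partitionTypes.get(i).equals(tableTypes.get(i))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Partition written with the old schema: a single array column.
        List<String> partition = Arrays.asList("array<string>");
        // Table schema with the int column appended at the end.
        List<String> appended = Arrays.asList("array<string>", "int");
        // Table schema with the int column inserted at the beginning.
        List<String> prepended = Arrays.asList("int", "array<string>");

        System.out.println(firstMismatch(partition, appended));  // prints -1
        System.out.println(firstMismatch(partition, prepended)); // prints 0
    }
}
```

In the appended case the extra trailing column is never paired with anything, so no conversion is attempted for it. In the prepended case position 0 pairs a list type with a primitive type, which corresponds to the kind of mismatch reported in the ClassCastException.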
> Reading of partitioned Avro data fails if partition schema does not match
> table schema
> --------------------------------------------------------------------------------------
>
> Key: HIVE-6835
> URL: https://issues.apache.org/jira/browse/HIVE-6835
> Project: Hive
> Issue Type: Bug
> Affects Versions: 0.12.0
> Reporter: Anthony Hsu
> Assignee: Anthony Hsu
> Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch
>
>
> To reproduce:
> {code}
> create table testarray (a array<string>);
> load data local inpath '/home/ahsu/test/array.txt' into table testarray;
> # create partitioned Avro table with one array column
> create table avroarray partitioned by (y string) row format serde
> 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties
> ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":
> "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"}
> } ] }') STORED as INPUTFORMAT
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
> insert into table avroarray partition(y=1) select * from testarray;
> # add an int column with a default value of 0
> alter table avroarray set serde
> 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with
> serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":
> "record", "fields": [ {"name":"intfield","type":"int","default":0},{
> "name":"a", "type":{"type":"array","items":"string"} } ] }');
> # fails with ClassCastException
> select * from avroarray;
> {code}
> The select * fails with:
> {code}
> Failed with exception java.io.IOException:java.lang.ClassCastException:
> org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector
> cannot be cast to
> org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
> {code}

--
This message was sent by Atlassian JIRA
(v6.2#6252)