Ben Roling created HIVE-5865:
--------------------------------

             Summary: AvroDeserializer incorrectly assumes keys to Maps will 
always be of type 'org.apache.avro.util.Utf8'
                 Key: HIVE-5865
                 URL: https://issues.apache.org/jira/browse/HIVE-5865
             Project: Hive
          Issue Type: Bug
    Affects Versions: 0.12.0, 0.11.0
            Reporter: Ben Roling


AvroDeserializer.deserializeMap() incorrectly assumes that the type of the keys 
will always be 'org.apache.avro.util.Utf8'.  If the reader schema defines 
"avro.java.string"="String", this assumption does not hold, resulting in a 
ClassCastException.

I think a simple fix would be to define 'mapDatum' with type 
Map<CharSequence,Object> instead of Map<Utf8,Object>.  Assuming the key has the 
more general type of 'CharSequence' avoids the need to make an assumption of 
either String or Utf8.
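
To illustrate the idea, here is a minimal sketch that uses only the JDK. It is 
not the actual Hive code: 'MapKeySketch', 'narrowKeys', 'generalKeys' and 
'FakeUtf8' are hypothetical names, and 'FakeUtf8' is only a stand-in for 
'org.apache.avro.util.Utf8' (which, like String, implements CharSequence). It 
shows that iterating a map through a Utf8-typed view fails as soon as a key is 
a String, while a CharSequence-typed view accepts both.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MapKeySketch {

    /** Hypothetical stand-in for org.apache.avro.util.Utf8 (an assumption for this sketch). */
    static final class FakeUtf8 implements CharSequence {
        private final String value;
        FakeUtf8(String value) { this.value = value; }
        @Override public int length() { return value.length(); }
        @Override public char charAt(int index) { return value.charAt(index); }
        @Override public CharSequence subSequence(int start, int end) {
            return value.subSequence(start, end);
        }
        @Override public String toString() { return value; }
    }

    /** Mirrors the problematic assumption: every key is the Utf8-like type. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> narrowKeys(Object datum) {
        // The unchecked cast itself succeeds because generics are erased.
        Map<FakeUtf8, Object> mapDatum = (Map<FakeUtf8, Object>) datum;
        Map<String, Object> result = new LinkedHashMap<>();
        // The compiler inserts a cast to FakeUtf8 for each key here, so a
        // String key triggers a ClassCastException at this point.
        for (FakeUtf8 key : mapDatum.keySet()) {
            result.put(key.toString(), mapDatum.get(key));
        }
        return result;
    }

    /** The proposed approach: only assume keys are some CharSequence. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> generalKeys(Object datum) {
        Map<CharSequence, Object> mapDatum = (Map<CharSequence, Object>) datum;
        Map<String, Object> result = new LinkedHashMap<>();
        for (CharSequence key : mapDatum.keySet()) {
            result.put(key.toString(), mapDatum.get(key));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Object, Object> stringKeyed = new LinkedHashMap<>();
        stringKeyed.put("a", 1);

        // Works for String keys (and would also work for FakeUtf8 keys).
        System.out.println(generalKeys(stringKeyed));

        try {
            narrowKeys(stringKeyed);
        } catch (ClassCastException e) {
            System.out.println("narrowKeys failed on a String key");
        }
    }
}
```

Converting each key with toString() keeps the rest of the code independent of 
whether Avro handed back String or Utf8 keys.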

I discovered the issue when using Hive 0.11.0.  Looking at the tags, it is also 
present in 0.12.0 and trunk:
https://github.com/apache/hive/blob/99f5bfcdf64330d062a30c0c9d83be1fbee54c34/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java#L313

I ran into this issue because my Hive table points to a schema file that I 
populated from the SCHEMA$ attribute of an Avro-generated Java class, and I had 
set stringType=String in the avro-maven-plugin configuration when generating my 
Java classes.

If I alter the schema my Hive table points to so that it doesn't have the 
"avro.java.string" attribute on my "map" type objects, queries work fine.  If I 
leave those attributes in, I get the ClassCastException any time I try to query 
the table.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
