[ https://issues.apache.org/jira/browse/AVRO-1562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sachin Goyal updated AVRO-1562:
-------------------------------
    Status: Patch Available  (was: Open)

Here is the first patch for this issue. Comments welcome!
Diff created using 'svn diff' from the trunk

> Add support for types extending Maps/Collections
> ------------------------------------------------
>
>                 Key: AVRO-1562
>                 URL: https://issues.apache.org/jira/browse/AVRO-1562
>             Project: Avro
>          Issue Type: Bug
>    Affects Versions: 1.7.6
>            Reporter: Sachin Goyal
>         Attachments: custom_map_and_collections1.patch
>
>
> Consider the following code:
> {code}
> import java.io.ByteArrayOutputStream;
> import java.util.*;
> import org.apache.avro.Schema;
> import org.apache.avro.file.DataFileWriter;
> import org.apache.avro.reflect.ReflectData;
> import org.apache.avro.reflect.ReflectDatumWriter;
> public class AvroDerivingMaps
> {
>     public static void main (String [] args) throws Exception
>     {
>         MapDerivedContainer orig = new MapDerivedContainer();
>         ReflectData rdata = ReflectData.AllowNull.get();
>         Schema schema = rdata.getSchema(MapDerivedContainer.class);
>         System.out.println(schema);
>
>         ReflectDatumWriter<MapDerivedContainer> datumWriter = new ReflectDatumWriter (MapDerivedContainer.class, rdata);
>         DataFileWriter<MapDerivedContainer> fileWriter = new DataFileWriter<MapDerivedContainer> (datumWriter);
>         ByteArrayOutputStream baos = new ByteArrayOutputStream();
>         fileWriter.create(schema, baos);
>         fileWriter.append(orig);
>         fileWriter.close();
>     }
> }
> class MapDerived extends HashMap<String, Integer>
> {
>     Integer a = 1;
>     String b = "b";
> }
> class MapDerivedContainer
> {
>     MapDerived2 map = new MapDerived2();
> }
> class MapDerived2 extends MapDerived
> {
>     String c = "c";
> }
> {code}
> \\
> \\
> It throws the following exception:
> {code:javascript}
> {"type":"record","name":"MapDerivedContainer","namespace":"avro","fields":[{"name":"map","type":["null",{"type":"record","name":"MapDerived2","fields":[{"name":"c","type":["null","string"],"default":null},{"name":"a","type":["null","int"],"default":null},{"name":"b","type":["null","string"],"default":null}]}],"default":null}]}
> {code}
> {color:brown}
> Exception in thread "main" org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.UnresolvedUnionException:
> Caused by: org.apache.avro.UnresolvedUnionException: Not in union ["null",{"type":"record","name":"MapDerived2","namespace":"avro","fields":[{"name":"c","type":["null","string"],"default":null},{"name":"a","type":["null","int"],"default":null},{"name":"b","type":["null","string"],"default":null}]}]: {}
>       at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:600)
>       at org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:151)
>       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)
>       at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:145)
>       at org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:114)
>       at org.apache.avro.reflect.ReflectDatumWriter.writeField(ReflectDatumWriter.java:203)
>       at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:104)
>       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>       at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:145)
>       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
>       at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:290)
>       ... 1 more
> {color}
> \\
> \\
> It appears that ReflectData#createSchema() checks for "type instanceof ParameterizedType" and because of this, it skips handling of the map.
> GenericData#isMap() makes no such check, so the value is apparently treated as a map at write time, and GenericData#resolveUnion() fails because the union has no map branch.
> The same may be true for classes extending ArrayList, Collection, Set etc.
> Also, note the schema for the class extending Map:
> {code:javascript}
> {
>    "type":"record",
>    "name":"MapDerived2",
>    "fields":[
>       {
>          "name":"c",
>          "type":[
>             "null",
>             "string"
>          ],
>          "default":null
>       },
>       {
>          "name":"a",
>          "type":[
>             "null",
>             "int"
>          ],
>          "default":null
>       },
>       {
>          "name":"b",
>          "type":[
>             "null",
>             "string"
>          ],
>          "default":null
>       }
>    ]
> }
> {code}
> This schema ignores the Map completely.
> Probably, for such a class, the schema should look like:
> {code:javascript}
> {
>    "type":"record",
>    "name":"MapDerived2",
>    "fields":[
>       {
>          "name":"c",
>          "type":[
>             "null",
>             "string"
>          ],
>          "default":null
>       },
>       .... // Other fields in the class extending the Map
>       {
>          "name":"BASE_MAP",
>          "type":[
>             "null",
>             "map" ... // Normal map which the class extends (implements?)
>          ],
>          "default":null
>       }
>    ]
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
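
The "type instanceof ParameterizedType" observation in the report can be reproduced with plain java.lang.reflect, without Avro on the classpath. The sketch below is only an illustration, not Avro's code and not the attached patch: MapTypeProbe, Holder, isParameterized and findMapSupertype are hypothetical names, and the nested classes mirror the reporter's MapDerived and MapDerived2. It shows that a field declared as Map<String, Integer> has a parameterized generic type while a field declared as MapDerived2 does not, and that the key and value types can still be recovered by walking up the superclass chain.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.HashMap;
import java.util.Map;

// Hypothetical probe class; not part of Avro.
class MapTypeProbe {
    // Same shape as the classes in the report (fields omitted).
    static class MapDerived extends HashMap<String, Integer> { }
    static class MapDerived2 extends MapDerived { }

    static class Holder {
        Map<String, Integer> plain;  // declared with type arguments
        MapDerived2 derived;         // declared as a bare class
    }

    /** True when the declared generic type of the named Holder field carries type arguments. */
    static boolean isParameterized(String fieldName) {
        try {
            Type t = Holder.class.getDeclaredField(fieldName).getGenericType();
            return t instanceof ParameterizedType;
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /**
     * Walks up the superclass chain and returns the first parameterized
     * supertype whose raw class is a Map, or null if there is none.
     * Only superclasses are inspected; directly implemented interfaces are not.
     */
    static ParameterizedType findMapSupertype(Class<?> c) {
        for (Class<?> k = c; k != null; k = k.getSuperclass()) {
            Type sup = k.getGenericSuperclass();
            if (sup instanceof ParameterizedType) {
                ParameterizedType p = (ParameterizedType) sup;
                Type raw = p.getRawType();
                if (raw instanceof Class && Map.class.isAssignableFrom((Class<?>) raw)) {
                    return p;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(isParameterized("plain"));        // true
        System.out.println(isParameterized("derived"));      // false
        System.out.println(new MapDerived2() instanceof Map); // true

        // The type arguments are still reachable through the superclass chain.
        ParameterizedType p = findMapSupertype(MapDerived2.class);
        Type[] kv = p.getActualTypeArguments();
        System.out.println(kv[0].getTypeName() + " -> " + kv[1].getTypeName());
    }
}
```

A schema generator that wanted to emit something like the BASE_MAP field proposed above could, under these assumptions, use a lookup of this kind to obtain the map's value type. Whether the patch does so is not stated in this message.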