The use case I will definitely need is one where a legacy class and a new class share the same Avro schema. The two classes have to coexist for a while, sometimes in the same process. The legacy class maps various Java types to Avro types differently (String, List, and Date are the more obvious ones) and cannot be changed. In the long run only the new class will be used; it can do things like lazily switch from Utf8 to a transient String once an operation that needs a String is requested, and on serialization it knows how to update the persistent types appropriately.
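To make the lazy conversion concrete, here is a minimal sketch of what I mean (the class and field names are hypothetical, not taken from either real class): the value stays a Utf8 as deserialized, and a transient String view is only built when something actually asks for one.

    import org.apache.avro.util.Utf8;

    // Hypothetical sketch: hold the Utf8 that Avro deserialized and only
    // build a transient String view when an operation actually needs one.
    public class LazyText {
      private Utf8 persistent;          // the value Avro reads/writes
      private transient String view;    // built on demand, never serialized

      public LazyText(Utf8 value) {
        this.persistent = value;
      }

      public String asString() {
        if (view == null) {
          view = persistent.toString(); // convert once, cache for later calls
        }
        return view;
      }

      public Utf8 asUtf8() {
        return persistent;              // still available for serialization
      }
    }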
I don't know what the best way to support that is, but IMO it is important to have a way to control how the default types map to Java classes via reflection, with a couple of built-in options for the most common variations. Subclassing works; a pluggable type map probably would as well.

On 8/6/09 9:28 PM, "Doug Cutting" <[email protected]> wrote:

tazan007 wrote:
> Yea makes sense. I am trying to go from Java to Avro :D so I ended up
> overriding getArraySize, getArrayElements, writeRecord, and writeString
> from ReflectDatumWriter so I could have it convert String to Utf8, Date
> to Long, and List to GenericArray.

Please consider contributing this. File an issue in Jira and attach the patch.

http://wiki.apache.org/hadoop/Avro/HowToContribute

> Also had to make some changes to ReflectData's createSchema to support
> String and Lists and changed RecordSchema's fieldsToJson to translate
> Date objects to long. Basically a kludge but works for getting data out
> of my Java objects into Avro objects.

Perhaps we should make ReflectData extensible, so that subclasses can determine the implementation types?

Doug
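For reference, a rough sketch of the kind of ReflectDatumWriter subclass discussed above. The method names match the ones mentioned in the thread, but the exact protected signatures differ between Avro releases, so treat this as an assumption rather than the actual patch; LegacyDatumWriter and the two mappings shown (String to Utf8, Date to epoch-millis long) are illustrative only.

    import java.io.IOException;
    import java.util.Date;
    import org.apache.avro.Schema;
    import org.apache.avro.io.Encoder;
    import org.apache.avro.reflect.ReflectDatumWriter;
    import org.apache.avro.util.Utf8;

    // Hypothetical subclass: write plain String fields as Utf8 and Date
    // fields as their epoch-millis long, leaving everything else to the
    // default reflect behavior.
    public class LegacyDatumWriter<T> extends ReflectDatumWriter<T> {
      public LegacyDatumWriter(Schema schema) {
        super(schema);
      }

      @Override
      protected void writeString(Object datum, Encoder out) throws IOException {
        // Accept java.lang.String (or any Object with a sensible toString)
        // where the encoder expects a Utf8.
        out.writeString(new Utf8(datum.toString()));
      }

      @Override
      protected void write(Schema schema, Object datum, Encoder out) throws IOException {
        if (datum instanceof Date) {
          // Translate Date to a long before the normal type dispatch sees it.
          out.writeLong(((Date) datum).getTime());
        } else {
          super.write(schema, datum, out);
        }
      }
    }

A matching ReflectData subclass (per Doug's suggestion) would then have to emit "long" for Date and "string" for String/Utf8 in createSchema so that the schema agrees with what the writer produces.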
