The use case I will definitely need is one where a legacy class and a new
class share the same Avro schema.  The two classes have to coexist for a
while, sometimes in the same process.  The legacy class maps various Java
types to Avro types differently (String, List, and Date are some of the
more obvious ones) and cannot be changed.  In the long run the new class
will be used exclusively; it can do things like lazily switch from Utf8 to
a transient String once an operation that needs a String is requested, and
on serialization it knows how to update the persistent types appropriately.
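
Concretely, that lazy switch could look something like this (LazyString is
an invented name for illustration only; the real class would have to tie
into the serialization path as well):

    import org.apache.avro.util.Utf8;

    // Illustration only: hold the Avro-native Utf8 until somebody
    // actually asks for a java.lang.String, then decode once and
    // keep the decoded value for later calls.
    public class LazyString {
      private Object value;  // Utf8 until first String access

      public LazyString(Utf8 utf8) { this.value = utf8; }

      public String get() {
        if (value instanceof Utf8)
          value = value.toString();  // decode once, cache the String
        return (String) value;
      }
    }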

I don't know what the best way to support that is, but IMO it is important
to have a way to control how the default types map to Java classes via
reflection, plus a couple of built-in options for the most common
variations.  Subclassing works; a pluggable type map probably would as well.
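
As a rough sketch of the subclassing option (LegacyReflectData is a
made-up name, and this assumes createSchema remains a protected hook,
which may not hold in every Avro release):

    import java.lang.reflect.Type;
    import java.util.Date;
    import java.util.Map;
    import org.apache.avro.Schema;
    import org.apache.avro.reflect.ReflectData;

    // Hypothetical: map java.util.Date to an Avro long (epoch millis)
    // while leaving every other type to the default reflection rules.
    public class LegacyReflectData extends ReflectData {
      @Override
      protected Schema createSchema(Type type, Map<String, Schema> names) {
        if (type == Date.class)
          return Schema.create(Schema.Type.LONG);
        return super.createSchema(type, names);
      }
    }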


On 8/6/09 9:28 PM, "Doug Cutting" <[email protected]> wrote:

tazan007 wrote:
> Yeah, makes sense.  I am trying to go from Java to Avro :D so I ended up
> overriding getArraySize, getArrayElements, writeRecord, and writeString
> from ReflectDatumWriter so I could have it convert String to Utf8, Date
> to Long, and List to GenericArray.
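
Roughly, such overrides might look like this (a sketch only: the hook
names and signatures here follow recent GenericDatumWriter and have
varied across Avro releases, and LegacyDatumWriter is a made-up name):

    import java.io.IOException;
    import java.util.Date;
    import java.util.Iterator;
    import java.util.List;
    import org.apache.avro.Schema;
    import org.apache.avro.io.Encoder;
    import org.apache.avro.reflect.ReflectDatumWriter;
    import org.apache.avro.util.Utf8;

    public class LegacyDatumWriter<T> extends ReflectDatumWriter<T> {

      // Accept plain java.lang.String where the base writer expects Utf8.
      @Override
      protected void writeString(Object datum, Encoder out)
          throws IOException {
        super.writeString(
            datum instanceof String ? new Utf8((String) datum) : datum, out);
      }

      // Coerce java.util.Date to epoch millis when the schema says long.
      @Override
      protected void write(Schema schema, Object datum, Encoder out)
          throws IOException {
        if (schema.getType() == Schema.Type.LONG && datum instanceof Date)
          datum = ((Date) datum).getTime();
        super.write(schema, datum, out);
      }

      // Let java.util.List stand in where a GenericArray is expected.
      @Override
      protected long getArraySize(Object array) {
        return array instanceof List
            ? ((List<?>) array).size() : super.getArraySize(array);
      }

      @Override
      protected Iterator<?> getArrayElements(Object array) {
        return array instanceof List
            ? ((List<?>) array).iterator() : super.getArrayElements(array);
      }
    }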

Please consider contributing this.  File an issue in Jira and attach the
patch.

http://wiki.apache.org/hadoop/Avro/HowToContribute

> Also had to make some changes to ReflectData's createSchema to support
> String and List, and changed RecordSchema's fieldsToJson to translate
> Date objects to long.  Basically a kludge, but it works for getting data
> out of my Java objects into Avro objects.

Perhaps we should make ReflectData extensible, so that subclasses can
determine the implementation types?

Doug
