Hello!

I am serializing Avro GenericRecords with Hadoop MapReduce, and they are
generally pretty large (more than 200 fields at times). When calling
context.write() in the Mapper/Reducer for the job, sometimes I am met with
an UnresolvedUnionException like so:

java.lang.Exception:
org.apache.avro.file.DataFileWriter$AppendWriteException:
org.apache.avro.UnresolvedUnionException: Not in union ["null","int"]: 123

Sometimes it's possible to guess which field was being serialized when
this exception is thrown, but most of the time I have no idea which field
caused it. It could be any of the 200+ fields that the Schema for the
GenericRecord declares as ["null", "int"], so it is very difficult to
narrow down.

It would be great if the exception could say something like:

java.lang.Exception:
org.apache.avro.file.DataFileWriter$AppendWriteException: When writing
field 'someField': org.apache.avro.UnresolvedUnionException: Not in union
["null","int"]: 123

I don't personally have any example code that can easily reproduce this,
but when I borrow the code in AVRO-966 (
https://issues.apache.org/jira/browse/AVRO-966) I see the same unclear
error message turning up.

Is there a way to achieve this easily? Or, would this have to be a new Avro
feature, and if so, is it possible? I admit that I don't have a super
in-depth knowledge of the inner workings of Avro.
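
For reference, the only workaround I can think of is a rough sketch like
the one below. It checks every top-level field against its schema with
GenericData.validate() before the record reaches context.write(). The
class InvalidFieldFinder, the method findInvalidFields and the field names
are hypothetical names of mine, not part of Avro, and I am assuming the
bad value is something like a Long sitting in a ["null","int"] field:

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

// Hypothetical helper (not part of Avro): reports which top-level fields of
// a record hold a value that does not conform to the field's schema.
final class InvalidFieldFinder {

    static List<String> findInvalidFields(GenericRecord record) {
        List<String> problems = new ArrayList<>();
        for (Schema.Field field : record.getSchema().getFields()) {
            Object value = record.get(field.pos());
            // validate() returns false when the datum does not match the
            // schema, e.g. a Long stored in a ["null","int"] union.
            if (!GenericData.get().validate(field.schema(), value)) {
                String type = (value == null) ? "null" : value.getClass().getName();
                problems.add(field.name() + " (" + type + "): " + value);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Made-up two-field schema, just for the demonstration.
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Example\",\"fields\":["
            + "{\"name\":\"goodField\",\"type\":[\"null\",\"int\"],\"default\":null},"
            + "{\"name\":\"someField\",\"type\":[\"null\",\"int\"],\"default\":null}]}");
        GenericRecord record = new GenericData.Record(schema);
        record.put("goodField", 123);   // an Integer, matches the union
        record.put("someField", 123L);  // a Long, does not match "int"
        for (String problem : findInvalidFields(record)) {
            System.out.println(problem);
        }
    }
}
```

This only looks at top-level fields, and validating 200+ fields for every
record is extra work in the Mapper/Reducer, so I would probably only call
it after catching the exception around context.write(). It also does not
change the exception message itself, which is what I am really after.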

Thank you!
