I don't have a specific use case that is problematic, but was trying to
understand how it all works internally. Following your comment about indexes I
looked in GenericDatumWriter and sure enough the union is tagged so we know
which part of the union was written:
    case UNION:
      int index = data.resolveUnion(schema, datum);
      out.writeIndex(index);
      write(schema.getTypes().get(index), datum, out);
      break;
That's the bit I was missing! Thanks for the input.
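
To make the tagging concrete, here is a minimal sketch of what ends up on the wire for a few adjacent nullable string fields. It does not use Avro's classes: UnionWriteSketch, writeNullableString and encode are hypothetical names, and the union is assumed to be ["null", "string"] in that order. It relies only on the binary encoding rules (a zigzag varint for the branch index and for the string length, and no bytes at all for a null value).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the write path; not part of Avro.
final class UnionWriteSketch {

    // Zigzag varint, the encoding Avro's binary format uses for int and long.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    // Assumed union: ["null", "string"], so index 0 = null and index 1 = string.
    static void writeNullableString(ByteArrayOutputStream out, String value) {
        if (value == null) {
            writeLong(out, 0); // only the branch index; the null itself adds no bytes
        } else {
            writeLong(out, 1); // branch index for "string"
            byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
            writeLong(out, utf8.length);
            out.write(utf8, 0, utf8.length);
        }
    }

    // Encodes consecutive nullable string fields, as a record writer would.
    static byte[] encode(String... fields) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String field : fields) {
            writeNullableString(out, field);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        for (byte b : encode(null, null, "hi")) {
            hex.append(String.format("%02x ", b));
        }
        System.out.println(hex.toString().trim()); // prints "00 00 02 04 68 69"
    }
}
```

Each null field still costs one byte, the index, which is why two adjacent nulls show up as 00 00 rather than as nothing.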
Andrew
From: Harsh J ha...@cloudera.com
To: user@avro.apache.org; Andrew Kenworthy adwkenwor...@yahoo.com
Sent: Monday, January 23, 2012 4:04 PM
Subject: Re: How does Avro mark (string) field delimition?
The read side is empty as well when the decoder is asked to read a
'null' type. For unions that include null, I believe an index is
written out, so if the index resolves to the null branch, the same
logic applies. It therefore does not matter if two nulls are adjacent
to one another. How do you imagine this ends up being a problem? What
trouble are you running into?
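
A rough sketch of the read side, in case it helps. This is not Avro's decoder: UnionReadSketch, readNullableString and decode are made-up names, and it assumes every field is a ["null", "string"] union with null at index 0. The reader is driven by the schema, so it knows how many fields to expect and only needs the index to pick a branch.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the read path; not part of Avro.
final class UnionReadSketch {

    // Decodes one zigzag varint, the binary encoding Avro uses for int and long.
    static long readLong(ByteArrayInputStream in) {
        long z = 0;
        int shift = 0;
        int b;
        do {
            b = in.read();
            if (b < 0) {
                throw new IllegalStateException("unexpected end of data");
            }
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (z >>> 1) ^ -(z & 1);
    }

    // Deliberately empty: a null value occupies zero bytes, so nothing is consumed.
    static void readNull(ByteArrayInputStream in) {
    }

    // Assumed union: ["null", "string"], so index 0 = null and index 1 = string.
    static String readNullableString(ByteArrayInputStream in) {
        long index = readLong(in);
        if (index == 0) {
            readNull(in);
            return null;
        }
        int length = (int) readLong(in);
        byte[] utf8 = new byte[length];
        if (length > 0 && in.read(utf8, 0, length) != length) {
            throw new IllegalStateException("truncated string");
        }
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // fieldCount comes from the schema, not from the data.
    static List<String> decode(byte[] data, int fieldCount) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        List<String> fields = new ArrayList<>();
        for (int i = 0; i < fieldCount; i++) {
            fields.add(readNullableString(in));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Two null branches followed by the string "hi".
        byte[] data = {0, 0, 2, 4, 0x68, 0x69};
        System.out.println(decode(data, 3)); // prints "[null, null, hi]"
    }
}
```

A field declared as plain "null" (no union) would skip the index as well and take no bytes at all; the reader still stays in step because the schema tells it the field is there.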
On Mon, Jan 23, 2012 at 8:08 PM, Andrew Kenworthy
adwkenwor...@yahoo.com wrote:
I have looked at the Avro 1.6.0 code and am not sure how Avro distinguishes
between field boundaries when reading null values.
The BinaryEncoder class (which is where I land when debugging my code) has
an empty method for writeNull: how does the parser then distinguish between
adjacent nullable fields when reading that data?
Thanks in advance,
Andrew
--
Harsh J
Customer Ops. Engineer, Cloudera