[ https://issues.apache.org/jira/browse/AVRO-2647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17093220#comment-17093220 ]
Thorsten Hake commented on AVRO-2647:
-------------------------------------

I see your point. I guess the given schema really is ambiguous. Sorry for my confusion.

But the problem also exists when schemas are not ambiguous. Consider the following example:
{code:json}
{
  "type": "record",
  "name": "R",
  "fields": [
    {
      "name": "F",
      "type": {
        "type": "array",
        "items": ["null", "string"]
      },
      "default": [null, "A"]
    }
  ]
}
{code}
The current Java implementation considers the default value as not being compatible with the type of the array.

Should this be tracked in a separate issue?

> specification does not fully specify semantics for unions in default types
> --------------------------------------------------------------------------
>
>                 Key: AVRO-2647
>                 URL: https://issues.apache.org/jira/browse/AVRO-2647
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: doc
>            Reporter: Roger
>            Priority: Minor
>
> Currently the specification does not make the semantics of union types
> within complex types clear. In particular, the spec talks about union fields,
> but leaves the semantics for unions in other contexts unspecified.
> Here's an example which is undefined according to the current specification:
> {code:json}
> {
>   "type": "record",
>   "name": "R",
>   "fields": [
>     {
>       "name": "F",
>       "type": {
>         "type": "array",
>         "items": [
>           {
>             "type": "enum",
>             "name": "E1",
>             "symbols": ["A", "B"]
>           },
>           {
>             "type": "enum",
>             "name": "E2",
>             "symbols": ["B", "A", "C"]
>           }
>         ]
>       },
>       "default": ["A", "B", "C"]
>     }
>   ]
> }
> {code}
> By experiment, most implementations seem to have chosen the semantics that
> are documented in this PR.
> In Java, the schema above is parsed without error, but when attempting to use
> the default value, it fails with a NullPointerException (trying to find the
> symbol C in E1). (Thanks to Ryan Skraba for this.)
> In [gogen-avro|https://github.com/actgardner/gogen-avro] it generates invalid
> code because it's assuming E1 but generating the symbol for "C" anyway.
> FWIW at some point in the future, I believe that it would be nice to align
> the default value specification with the JSON encoding for Avro so there
> aren't two subtly different JSON encodings of an Avro value.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
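
To make the two examples in this thread concrete, here is a minimal Python sketch that resolves each element of an array default against the union in "items" under two candidate readings of the spec. Everything in it is an assumption made for illustration: the helper names (matches, resolve_first_branch, resolve_first_match, resolve_array_default) are hypothetical, only the "null", "string" and enum schemas used above are modelled, and neither rule is claimed to be what the specification, the PR, or any Avro implementation actually does.

```python
def matches(schema, value):
    """Return True if a JSON default `value` fits a (simplified) schema."""
    if schema == "null":
        return value is None
    if schema == "string":
        return isinstance(value, str)
    if isinstance(schema, dict) and schema.get("type") == "enum":
        return isinstance(value, str) and value in schema["symbols"]
    raise ValueError(f"schema not covered by this sketch: {schema!r}")


def resolve_first_branch(union, value):
    """Hypothetical reading 1: only the first union branch may supply a default."""
    return 0 if matches(union[0], value) else None


def resolve_first_match(union, value):
    """Hypothetical reading 2: use the first branch that the value fits."""
    for index, branch in enumerate(union):
        if matches(branch, value):
            return index
    return None


def resolve_array_default(union, default, resolve):
    """Map each array element to the chosen branch index, or None if rejected."""
    return [resolve(union, item) for item in default]


# Union from the comment: "items": ["null", "string"], default [null, "A"].
NULLABLE_STRING = ["null", "string"]

# Union from the issue description: two enums sharing the symbols A and B.
E1 = {"type": "enum", "name": "E1", "symbols": ["A", "B"]}
E2 = {"type": "enum", "name": "E2", "symbols": ["B", "A", "C"]}
ENUMS = [E1, E2]

print(resolve_array_default(NULLABLE_STRING, [None, "A"], resolve_first_branch))  # [0, None]
print(resolve_array_default(NULLABLE_STRING, [None, "A"], resolve_first_match))   # [0, 1]
print(resolve_array_default(ENUMS, ["A", "B", "C"], resolve_first_branch))        # [0, 0, None]
print(resolve_array_default(ENUMS, ["A", "B", "C"], resolve_first_match))         # [0, 0, 1]
```

Under the first reading, "A" in [null, "A"] and "C" in ["A", "B", "C"] are both rejected, which is consistent with the Java behaviour reported above. Under the second reading both defaults resolve, with "A" and "B" bound to E1 and "C" to E2.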