[ https://issues.apache.org/jira/browse/AVRO-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795200#action_12795200 ]
Doug Cutting commented on AVRO-248:
-----------------------------------

Todd> and then use ["UserId", "ProductId"] with some way to distinguish between the two.

In Avro a record with a single integer field is the same size as an integer, and then you can use multiple records in a union. This seems nearly isomorphic, since at runtime you'd need a wrapper to distinguish the two branches anyway, no?

Hypothetically, we could permit only records in unions. That would name branches, but be inconvenient. It might also be non-pythonic. At the other extreme, to be pythonic, we could name nothing, and instead wrap things in named tags when we want to use them in a union. What we currently have is something in the middle: some things are permitted in unions without wrappers (e.g., importantly, null) while other distinctions require an explicit record-based wrapper. Adding another layer of naming seems perhaps excessive.

> make unions a named type
> ------------------------
>
>                 Key: AVRO-248
>                 URL: https://issues.apache.org/jira/browse/AVRO-248
>             Project: Avro
>          Issue Type: New Feature
>          Components: spec
>            Reporter: Doug Cutting
>             Fix For: 1.3.0
>
>
> Unions are currently anonymous. However it might be convenient if they were
> named. In particular:
> - when code is generated for a union, a class could be generated that
> includes an enum indicating which branch of the union is taken, e.g., a union
> of string and int named Foo might cause a Java class like
> {code}
> public class Foo {
>   public static enum Type {STRING, INT};
>   private Type type;
>   private Object datum;
>   public Type getType() { return type; }
>   public String getString() {
>     if (type == Type.STRING) return (String)datum; else throw ...
>   }
>   public void setString(String s) { type = Type.STRING; datum = s; }
>   ....
> }
> {code}
> Then Java applications can easily use a switch statement to process
> union values rather than using instanceof.
> - when using reflection, an abstract class with a set of concrete
> implementations can be represented as a union (AVRO-241). However, if one
> wishes to create an array one must know the name of the base class, which is
> not represented in the Avro schema. One approach would be to add an
> annotation to the reflected array schema (AVRO-242) noting the base class.
> But if the union itself were named, that could name the base class. This
> would also make reflected protocol interfaces more concise, since the base
> class name could be used in parameters, return types, and fields.
> - Generalizing the above: Avro lacks class inheritance; unions are a way to
> model inheritance, and this model is more useful if the union is named.
> This would be an incompatible change to schemas. If we go this way, we
> should probably rename 1.3 to 2.0. Note that AVRO-160 proposes an
> incompatible change to data file formats, which may also force a major
> release.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
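
To illustrate the size claim in the comment above (a record with a single integer field is the same size as an integer), here is a minimal sketch of Avro's binary encoding rules as I understand them: ints and longs are written as zig-zag variable-length integers, a record is the concatenation of its fields, and a union value is the zero-based branch index followed by the branch value. The class and method names (UnionSizeSketch, encodeLong, encodeSingleIntRecord, encodeUnion) are hypothetical and are not part of the Avro library.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration only; this is not the Avro library's API.
class UnionSizeSketch {

    // Zig-zag, variable-length encoding used for Avro int and long values.
    static byte[] encodeLong(long n) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long z = (n << 1) ^ (n >> 63);            // zig-zag: small magnitudes stay small
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80)); // low 7 bits, continuation bit set
            z >>>= 7;
        }
        out.write((int) z);                       // final byte, continuation bit clear
        return out.toByteArray();
    }

    // A record is just its fields in order, so a record with one int field
    // (e.g. a UserId or ProductId wrapper) encodes to the same bytes as the int.
    static byte[] encodeSingleIntRecord(int field) {
        return encodeLong(field);
    }

    // A union value is the zero-based branch index followed by the branch value.
    static byte[] encodeUnion(int branchIndex, byte[] value) {
        byte[] index = encodeLong(branchIndex);
        byte[] out = new byte[index.length + value.length];
        System.arraycopy(index, 0, out, 0, index.length);
        System.arraycopy(value, 0, out, index.length, value.length);
        return out;
    }

    public static void main(String[] args) {
        byte[] bare = encodeLong(42);               // a plain int
        byte[] record = encodeSingleIntRecord(42);  // the same int wrapped in a record
        byte[] union = encodeUnion(1, record);      // second branch of ["UserId", "ProductId"]
        // prints: 1 1 2
        System.out.println(bare.length + " " + record.length + " " + union.length);
    }
}
```

Under these assumptions the record wrapper adds no bytes; the only cost of the union is the branch index, which is paid whether the branches are bare ints or single-field records.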