[ 
https://issues.apache.org/jira/browse/AVRO-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784392#action_12784392
 ] 

Philip Zeyliger commented on AVRO-248:
--------------------------------------

I could go both ways.  

(Yes, names!) Say we had a "host" record. It might hold a union of "hostname" 
(a string) and "IP address" (a record).  I would rather see the code say 
getHostname() than getString().  You can get around this by creating a 
Hostname record, but then the call becomes getHostname().getHostname(), since 
records always have field names.  With names, the restriction that a union 
contain only one branch of any unnamed type could be relaxed.  Sometimes two 
things share the same primitive type but mean different things.  I'm 
struggling to come up with a great example, but perhaps a date could be 
expressed as "days since 1900" (Excel style) or "days since the epoch".  Both 
are ints.
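
To make the naming argument concrete, here is a rough Java sketch of what 
named branches might look like to calling code.  Everything in it (Host, 
IpAddress, DayCount, the Branch enums, the factory methods) is hypothetical 
and not taken from Avro or its code generator.

```java
// Hypothetical sketch; none of these class or method names come from Avro.

// Stand-in for the "IP address" record branch.
final class IpAddress {
    final String dotted;
    IpAddress(String dotted) { this.dotted = dotted; }
}

// A union whose branches are named by role rather than by type.
final class Host {
    enum Branch { HOSTNAME, IP_ADDRESS }

    private final Branch branch;
    private final Object datum;

    private Host(Branch branch, Object datum) {
        this.branch = branch;
        this.datum = datum;
    }

    static Host ofHostname(String hostname) { return new Host(Branch.HOSTNAME, hostname); }
    static Host ofIpAddress(IpAddress address) { return new Host(Branch.IP_ADDRESS, address); }

    Branch getBranch() { return branch; }

    // Reads as getHostname(), not getString() or getHostname().getHostname().
    String getHostname() {
        if (branch != Branch.HOSTNAME) throw new IllegalStateException("branch is " + branch);
        return (String) datum;
    }

    IpAddress getIpAddress() {
        if (branch != Branch.IP_ADDRESS) throw new IllegalStateException("branch is " + branch);
        return (IpAddress) datum;
    }
}

// Two branches with the same primitive type (int), told apart only by name.
final class DayCount {
    enum Branch { DAYS_SINCE_1900, DAYS_SINCE_EPOCH }

    private final Branch branch;
    private final int days;

    private DayCount(Branch branch, int days) {
        this.branch = branch;
        this.days = days;
    }

    static DayCount since1900(int days) { return new DayCount(Branch.DAYS_SINCE_1900, days); }
    static DayCount sinceEpoch(int days) { return new DayCount(Branch.DAYS_SINCE_EPOCH, days); }

    Branch getBranch() { return branch; }
    int getDays() { return days; }
}
```

With type-derived accessors, DayCount could not exist at all, because both 
branches would map to getInt().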

(No names!) In Java, half the time the names of fields are boring.  Fields are 
called "outputStream" and have type "OutputStream", and, really, did we need 
both?

In Avro's case, especially because unions are the way to implement nullable 
fields, name-less is pretty convincing.

> make unions a named type
> ------------------------
>
>                 Key: AVRO-248
>                 URL: https://issues.apache.org/jira/browse/AVRO-248
>             Project: Avro
>          Issue Type: New Feature
>          Components: spec
>            Reporter: Doug Cutting
>            Assignee: Doug Cutting
>             Fix For: 1.3.0
>
>
> Unions are currently anonymous.  However it might be convenient if they were 
> named.  In particular:
>  - when code is generated for a union, a class could be generated that 
> includes an enum indicating which branch of the union is taken, e.g., a union 
> of string and int named Foo might cause a Java class like
> {code}
> public class Foo {
>   public static enum Type {STRING, INT};
>   private Type type;
>   private Object datum;
>   public Type getType() { return type; }
>   public String getString() {
>     if (type == Type.STRING) return (String) datum;
>     else throw ...
>   }
>   public void setString(String s) { type = Type.STRING;  datum = s; }
>   ....
> }
> {code}
> Then Java applications can easily use a switch statement to process 
> union values rather than using instanceof.
>  - when using reflection, an abstract class with a set of concrete 
> implementations can be represented as a union (AVRO-241).  However, if one 
> wishes to create an array one must know the name of the base class, which is 
> not represented in the Avro schema.  One approach would be to add an 
> annotation to the reflected array schema (AVRO-242) noting the base class.  
> But if the union itself were named, that could name the base class.  This 
> would also make reflected protocol interfaces more concise, since the base 
> class name could be used in parameters, return types and fields.
>  - Generalizing the above: Avro lacks class inheritance, unions are a way to 
> model inheritance, and this model is more useful if the union is named.
>
> This would be an incompatible change to schemas.  If we go this way, we 
> should probably rename 1.3 to 2.0.  Note that AVRO-160 proposes an 
> incompatible change to data file formats, which may also force a major 
> release.
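
The switch-based processing mentioned in the issue description could look 
roughly like the sketch below.  It fills in the Foo class from the 
description with assumptions of my own: the exception type 
(IllegalStateException), the getInt()/setInt() pair, and the FooPrinter 
helper are all hypothetical and not what Avro generates.

```java
// Sketch of the proposed generated class; details beyond the issue text are assumptions.
final class Foo {
    enum Type { STRING, INT }

    private Type type;      // which branch is currently set, or null if none
    private Object datum;   // the value for that branch

    Type getType() { return type; }

    String getString() {
        // Exception type is an assumption; the issue text leaves it elided.
        if (type != Type.STRING) throw new IllegalStateException("union holds " + type);
        return (String) datum;
    }

    void setString(String s) { type = Type.STRING; datum = s; }

    int getInt() {
        if (type != Type.INT) throw new IllegalStateException("union holds " + type);
        return (Integer) datum;
    }

    void setInt(int i) { type = Type.INT; datum = i; }
}

// Hypothetical consumer: dispatches on the branch enum instead of instanceof.
final class FooPrinter {
    static String describe(Foo foo) {
        if (foo.getType() == null) throw new IllegalStateException("union is unset");
        switch (foo.getType()) {
            case STRING:
                return "string: " + foo.getString();
            case INT:
                return "int: " + foo.getInt();
            default:
                throw new IllegalStateException("unknown branch " + foo.getType());
        }
    }
}
```

Adding a branch to the union adds an enum constant, so a switch that misses 
it falls through to the default case at run time.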

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
