Hi Elliot,

Thanks for that bit of info. It is helpful. Where do you draw the line between 
complex unions versus simple unions? In other words, what criteria do you use 
to say this union is too complex?

Thanks,

Scott
________________________________
From: Elliot West <tea...@gmail.com>
Sent: Saturday, May 26, 2018 1:58 AM
To: user@avro.apache.org
Subject: Re: Avro Schema Question

A word of caution on the union type. You may find support for unions very 
patchy if you are hoping to process records using well known data processing 
engines. We’ve been unable to usefully read union types in either Apache Spark 
or Hive, for example. The simple null union construct is the exception: ["null", 
typeA], as it is usually represented by a nullable column of typeA. We’ve 
resorted to prohibiting schemas with complex unions so that our producers can’t 
create data that is not fully readable by our consumers.
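
As an illustration only, a check of this kind could be sketched in Python as below. It assumes that "simple" means a two-branch union with exactly one "null" branch; that definition, the helper names is_nullable_union and find_complex_unions, and the Event schema are all hypothetical and not taken from any actual tooling described in this thread.

```python
def is_nullable_union(schema):
    """True for a two-branch union in which exactly one branch is "null"."""
    if not isinstance(schema, list) or len(schema) != 2:
        return False
    nulls = [b for b in schema if b == "null" or b == {"type": "null"}]
    return len(nulls) == 1


def find_complex_unions(schema, path="$"):
    """Yield the path of every union that is not the simple nullable form."""
    if isinstance(schema, list):
        # A JSON array in schema position is a union.
        if not is_nullable_union(schema):
            yield path
        for i, branch in enumerate(schema):
            yield from find_complex_unions(branch, f"{path}[{i}]")
    elif isinstance(schema, dict):
        kind = schema.get("type")
        if kind == "record":
            for field in schema.get("fields", []):
                yield from find_complex_unions(
                    field["type"], f"{path}.{field['name']}"
                )
        elif kind == "array":
            yield from find_complex_unions(schema["items"], f"{path}.items")
        elif kind == "map":
            yield from find_complex_unions(schema["values"], f"{path}.values")
        elif isinstance(kind, (list, dict)):
            # "type" itself holds a nested schema.
            yield from find_complex_unions(kind, path)


# Hypothetical schema: one nullable field and one int-or-string union.
schema = {
    "type": "record",
    "name": "Event",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "note", "type": ["null", "string"]},
        {"name": "payload", "type": ["int", "string"]},
    ],
}

violations = list(find_complex_unions(schema))
print(violations)  # ['$.payload']
```

Only the payload field is flagged; the ["null", "string"] field passes because it maps onto a nullable column.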

Elliot.

On Fri, 25 May 2018 at 22:30, Motoko Kusanagi 
<major-motoko-kusan...@outlook.com> 
wrote:
Hi Michael,

Thanks!! Yes, it does.

Scott
________________________________
From: Michael Smith <micha...@syapse.com>
Sent: Friday, May 25, 2018 2:21 PM
To: user@avro.apache.org
Subject: Re: Avro Schema Question

{"type": "int"}, {"type": "string"} is not valid JSON, so you definitely can't 
do that. But

[{"type": "int"}, {"type": "string"}] is a valid schema -- it can encode a 
single value that is either an int or a string. At the highest level, your 
schema can only be one type, but that type may be (and in fact probably will 
be) a complex type -- a union of records or a single record.
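
A quick way to see the difference is to run both forms through a JSON parser. The sketch below uses only Python's standard json module, so it checks JSON well-formedness and not Avro schema validity; the helper name is_valid_json is made up for illustration.

```python
import json

# The comma-separated form from the question: two JSON objects side by side.
side_by_side = '{"type": "int"}, {"type": "string"}'

# The union form: one JSON array whose elements are the branch schemas.
union = '[{"type": "int"}, {"type": "string"}]'


def is_valid_json(text):
    """Return True if text parses as exactly one JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


side_by_side_ok = is_valid_json(side_by_side)
union_ok = is_valid_json(union)
print(side_by_side_ok, union_ok)  # False True
```

The first form fails because a JSON document holds a single top-level value; wrapping the branches in an array makes it one value, which Avro then reads as a union.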

Does that answer your question?

On Fri, May 25, 2018 at 5:08 PM Motoko Kusanagi 
<major-motoko-kusan...@outlook.com> 
wrote:

Hi,


I read the specification multiple times. In the specification, it says "A 
Schema is represented in JSON<http://www.json.org/> by one of:" in the Schema 
Declaration section. The "one" confuses me as I am interpreting it as exactly 
one of the 3 that it listed.


In short, can I do this as a single schema?

{type : int},

{type : string},

{type : int},


Or do the following as a single schema?

{type : int},

{type : record ....},

{type : record ....}, // Not the same as the previous.

{type : string},


Or do I have to "embed" the above under a complex type like a record if I want 
complex schema? Or does "one of" mean I have to choose one and exactly one for 
the high top-most level of the schema?


Thanks!!



--


Michael A. Smith — Senior Systems Engineer

________________________________

micha...@syapse.com
syapse.com <http://www.syapse.com/>
100 Matsonford Road
Five Radnor Corporate Center
Suite 444
Radnor, PA 19087
https://www.linkedin.com/in/michaelalexandersmith

