Hi Martin,

Thank you for your clear answer.
I will test the example you provided.
In that case, storing binary Avro as a blob in a database is strongly 
discouraged: it is very difficult, if not impossible, to deserialize all 
rows with a single reader.
Best regards,
Youcef

From: Martin Kleppmann [mailto:mar...@kleppmann.com]
Sent: Monday, 14 December 2015 22:46
To: <user@avro.apache.org>
Subject: Re: add a type to a union

Hi Youcef,

Glad you found my old blog post on Avro schema evolution :)

I encourage you to try a simple example, which will make it clearer: 
https://gist.github.com/ept/5fd7c625969248b31e73

In this example, the writer has a union of null, string and long, whereas the 
reader only has a union of null and string. A default value of null is set. If 
the record has a null or string value, it is correctly parsed by the reader. If 
the record has a long value, the reader throws an exception, because it is not 
one of the union datatypes it is expecting.
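
The behaviour described above can be sketched without the Avro library. This is a simplified model under stated assumptions, not the real resolution code: `branch_of`, `resolve_union` and `UnionResolutionError` are hypothetical names, only null, string and long are handled, and type promotion (for example int to long) is ignored.

```python
class UnionResolutionError(Exception):
    """Raised when the writer used a union branch the reader does not have."""


def branch_of(value):
    """Map a Python value to the Avro type name used in this sketch."""
    if value is None:
        return "null"
    if isinstance(value, str):
        return "string"
    if isinstance(value, int) and not isinstance(value, bool):
        return "long"
    raise TypeError(f"unsupported value in this sketch: {value!r}")


def resolve_union(value, writer_union, reader_union, default=None):
    """Return the value if the reader's union contains the writer's branch.

    `default` is accepted only to show that it plays no role here: a field
    default is used when the field is missing from the writer's schema,
    not when the reader meets an unknown union branch.
    """
    branch = branch_of(value)
    if branch not in writer_union:
        raise ValueError(f"{branch} is not a branch of the writer's union")
    if branch not in reader_union:
        raise UnionResolutionError(
            f"reader union {reader_union} has no branch for {branch}"
        )
    return value


# Hypothetical unions mirroring the example: the writer has one extra branch.
WRITER_UNION = ["null", "string", "long"]
READER_UNION = ["null", "string"]

for sample in (None, "parcel", 42):
    try:
        print("read:", resolve_union(sample, WRITER_UNION, READER_UNION))
    except UnionResolutionError as exc:
        print("failed:", exc)
```

In this model the null and string values are read and the long value fails, whatever `default` is passed.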

So the default value unfortunately doesn't help here. If you want to add a new 
branch to a union schema, you have to make sure that all the readers are 
updated with the new schema first, and only then should writers start 
generating data with the new schema.
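
One way to enforce that rollout order is a check that runs before writers switch to the new schema. The sketch below is only an illustration: `readers_missing_branches` and the reader names are hypothetical, and unions are reduced to lists of type names.

```python
def readers_missing_branches(writer_union, reader_unions):
    """Return {reader name: missing branches} for readers that cannot yet
    read every branch the writer may emit."""
    missing = {}
    for name, union in reader_unions.items():
        gap = [branch for branch in writer_union if branch not in union]
        if gap:
            missing[name] = gap
    return missing


# Hypothetical deployment state: one reader still has the old union.
new_writer_union = ["null", "string", "long"]
deployed_readers = {
    "indexer": ["null", "string", "long"],
    "archiver": ["null", "string"],
}

blocked = readers_missing_branches(new_writer_union, deployed_readers)
if blocked:
    print("do not enable the new writer yet:", blocked)
else:
    print("all readers updated; writers may emit the new branch")
```

Writers would start emitting the new branch only once this check reports no missing branches.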

Hope that helps.
Martin


On 7 Dec 2015, at 22:15, HILEM Youcef <youcef.hi...@laposte.fr> wrote:

Hi,

At La Poste Pôle Colis we use Avro in our new reactive architecture (Kafka, 
Spark Streaming, Cassandra, Elasticsearch, Play Framework).

In our model, the body field is a union type, which lets a single schema 
cover all tracking events of a parcel (arrival, departure, transportation, 
...).

Example:
{
  "namespace" : "fr.laposte.colis.schema.pivot.message",
  "name" : "Message",
  "type" : "record",
  "doc" : "This structure defines the basic characteristics of a message. It can (must) be specialized for a particular use",
  "fields" : [
    {
      "name" : "header",
      "type" : "fr.laposte.colis.schema.pivot.common.message.MessageHeader",
      "doc" : "Message header"
    },
    {
      "name" : "body",
      "type" : [
        "fr.laposte.colis.schema.pivot.announcement.AnnouncementEventMessageBody",
        "fr.laposte.colis.schema.pivot.delivery.DeliveryEventMessageBody",
        "fr.laposte.colis.schema.pivot.handling.HandlingEventMessageBody",
        "fr.laposte.colis.schema.pivot.crm.CrmEventMessageBody",
        "fr.laposte.colis.schema.pivot.customs.transport.CustomsTransportMessageBody",
        "fr.laposte.colis.schema.pivot.customs.consignment.CustomsContainerEventMessageBody",
        "fr.laposte.colis.schema.pivot.customs.consignment.CustomsParcelEventMessageBody",
        "fr.laposte.colis.schema.pivot.rest.common.Rest",
        "fr.laposte.colis.schema.pivot.reject.RejectMessageBody",
        "fr.laposte.colis.schema.pivot.dpmo.defectrequest.DefectRequestEventMessageBody",
        "fr.laposte.colis.schema.pivot.dpmo.defectresult.DefectResultEventMessageBody",
        "fr.laposte.colis.schema.timeout.TimeoutMessageBody",
        "fr.laposte.colis.schema.notification.Notification"
      ],
      "doc" : "Abstraction of the message body. Can be substituted by any type derived from the MessageBody type"
    }
  ]
}

However, as explained well at 
https://martin.kleppmann.com/2012/12/05/schema-evolution-in-avro-protocol-buffers-thrift.html: 
“Union types are powerful, but you must take care when changing them. If you 
want to add a type to a union, you first need to update all readers with the 
new schema, so that they know what to expect. Only once all readers are 
updated, the writers may start putting this new type in the records they 
generate”

My question: is a default value for the “body” field sufficient, so that if 
the reader encounters a union branch it does not know about, it can substitute 
the default value (see 
http://grokbase.com/t/avro/user/11b3bn6r6z/does-extending-union-break-compatibility)?

Thank you in advance for your help.

