We haven’t seriously considered Protocol Buffers.  In general, the tools we’re
interested in have better support for Avro than for protobuf.  Avro was designed
for storing data in big-data storage like HDFS, and many tools for analyzing
such data have adopted it.  For example, Hive comes with Avro support built in.

More generally, we like the design choices that Avro has made:

1. Self-describing container files
2. Easy convertibility to/from JSON
3. Not tightly tied to code generation



We’ve experienced these downsides, however:

1. We’ve been bitten hard by buggy Avro library versions.  You want to stick to
the latest one.
2. Hadoop ships with one such older, buggy version of Avro, and it is a major
pain to work around it.
3. Avro's “one definition file = one schema = one record type” assumption 
causes us some trouble.
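
To make the container-file point from the quoted message below concrete, here is a minimal Python sketch of the general idea behind a schema registry: each message carries a short id of the schema it was written with, so a reader can look that schema up and project old data onto its own schema. Everything here is an assumption for illustration only. The names (REGISTRY, fingerprint, encode, decode) are hypothetical, the payload is plain JSON rather than Avro binary, the 8-byte prefix is not the Confluent wire format, and the field matching is a much-simplified stand-in for Avro's real schema resolution rules.

```python
import hashlib
import json

# Hypothetical in-memory stand-in for a schema registry: id -> writer schema.
REGISTRY = {}

def fingerprint(schema):
    """Return an 8-byte id derived from the schema's canonical JSON form."""
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).digest()[:8]

def register(schema):
    """Store the schema under its id and return that id."""
    fp = fingerprint(schema)
    REGISTRY[fp] = schema
    return fp

def encode(record, writer_schema):
    """Prefix the payload with the writer schema's id (JSON body, not Avro binary)."""
    body = json.dumps(record, sort_keys=True).encode("utf-8")
    return register(writer_schema) + body

def decode(message, reader_schema):
    """Look up the writer schema by id, then project the record onto the reader schema."""
    writer_schema = REGISTRY[message[:8]]
    record = json.loads(message[8:].decode("utf-8"))
    writer_fields = {f["name"] for f in writer_schema["fields"]}
    result = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in writer_fields:
            result[name] = record[name]        # field known to the writer
        elif "default" in field:
            result[name] = field["default"]    # field added later: use the default
        else:
            raise ValueError("no value or default for field " + name)
    return result

V1 = {"type": "record", "name": "User",
      "fields": [{"name": "id", "type": "long"}]}
V2 = {"type": "record", "name": "User",
      "fields": [{"name": "id", "type": "long"},
                 {"name": "email", "type": "string", "default": ""}]}

old_message = encode({"id": 7}, V1)   # written before "email" existed
new_view = decode(old_message, V2)    # read with the newer schema
```

Without the id prefix, decode would have no way to know which fields the writer knew about, which is the situation a bare Kafka message is in.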


On 11/20/15, 2:47 AM, "Selina Tech" <swucaree...@gmail.com> wrote:

>Hi, Luis:
>        Thanks a lot for your detailed reply with your code and the link to
>the Avro schema registry.
>        May I ask a question: have you considered Protocol Buffers as your
>message type?
>
>Sincerely,
>Selina
>
>
>On Thu, Nov 19, 2015 at 2:22 PM, Luis Casillas <
>luis.casil...@progressfin.com> wrote:
>
>>
>> I did a Samza proof of concept project recently and I ended up writing
>> this code:
>>
>> https://gist.github.com/ldcasillas-progreso/871af3c1a1790be975fd
>>
>> In the end, however, I switched the project from Avro to JSON.  The issue
>> is that Avro is designed to work with its self-describing container file
>> format, which embeds the schema used to write the records in the file.
>> Avro’s schema evolution features rely on this embedded schema; when the
>> embedded schema and the reader’s schema are not equal, Avro uses its
>> special rules to translate the old data to the new schema.
>>
>> But when you’re working with Kafka/Samza, there is no container file, so
>> a reader has no embedded writer schema to resolve against.  Therefore,
>> none of the schema evolution tools work out of the box, and if you change
>> your Avro schema, you likely won’t be able to read any of the old messages
>> again.
>>
>> There’s a Kafka Avro schema registry project that aims to fix this:
>>
>> https://github.com/confluentinc/schema-registry
>>
>> I tried it but the released version just was not mature enough—which is
>> why I ended up using JSON.  But I did write a Serde that encodes/decodes
>> the Avro objects in JSON:
>>
>> https://gist.github.com/ldcasillas-progreso/3611d40d2833aa62c1b3
>>
>> Hope this helps.
>>
>>
>>
>>
>>
>> On 11/17/15, 12:32 AM, "Selina Tech" <swucaree...@gmail.com> wrote:
>>
>> >Dear All:
>> >     Do you know where I can find a tutorial or sample code for writing
>> >Avro messages to Kafka and reading Avro messages from Kafka in Samza?
>> >      I am wondering how I should serialize a GenericRecord to bytes and
>> >deserialize it.
>> >     Your comments/suggestions are highly appreciated.
>> >
>> >Sincerely,
>> >Selina
>>
>>
>> -----------
>> This message and any files or text attached to it are intended only for
>> the recipients named above, and contain information that is confidential or
>> privileged. If you are not an intended recipient, you must not read, copy,
>> use or disclose this communication. Please also notify the sender by
>> replying to this message, and then delete all copies of it from your system.
>>

