Hi, Luis: Thanks a lot for your reply!
Sincerely,
Selina

On Fri, Nov 20, 2015 at 12:09 PM, Luis Casillas <luis.casil...@progressfin.com> wrote:

> We haven’t seriously considered Protocol Buffers. In general the tools
> we’re interested in have better support for Avro than for protobuf; Avro
> was designed for storing data in big-data storage like HDFS, and many tools
> for analyzing such data have taken it up. For example, Hive comes with Avro
> support built in.
>
> More generally, we like the design choices that Avro has made:
>
> 1. Self-describing container files
> 2. Easy convertibility to/from JSON
> 3. Not tightly tied to code generation
>
> We’ve experienced these downsides, however:
>
> 1. We’ve been bitten hard by buggy Avro library versions. You want to stick
>    to the latest one.
> 2. Hadoop ships with one such older, buggy version of Avro, and it is a
>    major pain to work around it.
> 3. Avro's “one definition file = one schema = one record type” assumption
>    causes us some trouble.
>
> On 11/20/15, 2:47 AM, "Selina Tech" <swucaree...@gmail.com> wrote:
>
> >Hi, Luis:
> >    Thanks a lot for your detailed reply, with your code and the link to
> >the Avro schema registry.
> >    May I ask a question: have you considered Protocol Buffers as your
> >message type?
> >
> >Sincerely,
> >Selina
> >
> >On Thu, Nov 19, 2015 at 2:22 PM, Luis Casillas
> ><luis.casil...@progressfin.com> wrote:
> >
> >> I did a Samza proof of concept project recently and I ended up writing
> >> this code:
> >>
> >> https://gist.github.com/ldcasillas-progreso/871af3c1a1790be975fd
> >>
> >> In the end, however, I switched the project from Avro to JSON. The issue
> >> is that Avro is designed to work with its self-describing container file
> >> format, which embeds the schema used to write the records in the file.
> >> Avro’s schema evolution features rely on this embedded schema; when the
> >> embedded schema and the reader’s schema are not equal, Avro uses its
> >> schema resolution rules to translate the old data to the new schema.
> >>
> >> But when you’re working with Kafka/Samza, there is no container file.
> >> Therefore, none of the schema evolution tools work, and if you change
> >> your Avro schema, you likely won’t be able to read any of the old
> >> messages again.
> >>
> >> There’s a Kafka Avro schema registry project that aims to fix this:
> >>
> >> https://github.com/confluentinc/schema-registry
> >>
> >> I tried it, but the released version just was not mature enough, which is
> >> why I ended up using JSON. But I did write a Serde that encodes/decodes
> >> the Avro objects in JSON:
> >>
> >> https://gist.github.com/ldcasillas-progreso/3611d40d2833aa62c1b3
> >>
> >> Hope this helps.
> >>
> >> On 11/17/15, 12:32 AM, "Selina Tech" <swucaree...@gmail.com> wrote:
> >>
> >> >Dear All:
> >> >    Do you know where I can find a tutorial or sample code for writing
> >> >Avro messages to Kafka and reading Avro messages from Kafka in Samza?
> >> >    I am wondering how I should serialize a GenericRecord to bytes and
> >> >deserialize it.
> >> >    Your comments and suggestions are highly appreciated.
> >> >
> >> >Sincerely,
> >> >Selina
> >>
> >> -----------
> >> This message and any files or text attached to it are intended only for
> >> the recipients named above, and contain information that is confidential
> >> or privileged. If you are not an intended recipient, you must not read,
> >> copy, use or disclose this communication. Please also notify the sender by
> >> replying to this message, and then delete all copies of it from your
> >> system.
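The problem raised in the Nov 19 message (Kafka messages have no container file, so a reader cannot tell which schema wrote them) is usually approached by prefixing every message with a small schema identifier that the reader looks up. The following is a minimal Python sketch of that framing only. It assumes the layout used by the Confluent schema registry serializers (one zero "magic" byte followed by a 4-byte big-endian schema ID); the names `frame`, `unframe`, and `KNOWN_SCHEMAS` are hypothetical, the dictionary stands in for a real registry lookup, and the Avro payload is treated as opaque bytes produced elsewhere.

```python
import struct
from typing import Tuple

# Assumption: header layout modeled on the Confluent serializers,
# i.e. 1 magic byte (0) followed by a 4-byte big-endian schema ID.
MAGIC_BYTE = 0
HEADER = struct.Struct(">bI")  # 5 bytes, no padding in big-endian mode

# Hypothetical stand-in for a schema registry: schema ID -> writer schema JSON.
KNOWN_SCHEMAS = {
    1: '{"type":"record","name":"User","fields":[{"name":"name","type":"string"}]}',
    2: '{"type":"record","name":"User","fields":[{"name":"name","type":"string"},'
       '{"name":"age","type":"int","default":0}]}',
}


def frame(schema_id: int, avro_payload: bytes) -> bytes:
    """Prefix already-encoded Avro bytes with the magic byte and schema ID."""
    return HEADER.pack(MAGIC_BYTE, schema_id) + avro_payload


def unframe(message: bytes) -> Tuple[int, bytes]:
    """Split a framed message into (schema_id, avro_payload)."""
    if len(message) < HEADER.size:
        raise ValueError("message too short to contain a header")
    magic, schema_id = HEADER.unpack(message[:HEADER.size])
    if magic != MAGIC_BYTE:
        raise ValueError("unexpected magic byte: %d" % magic)
    return schema_id, message[HEADER.size:]


def writer_schema_for(message: bytes) -> Tuple[str, bytes]:
    """Return the writer schema JSON and the payload for a framed message."""
    schema_id, payload = unframe(message)
    return KNOWN_SCHEMAS[schema_id], payload
```

With the writer schema recovered this way, the reader can hand both the writer schema and its own schema to the Avro decoder, which is what the container file would otherwise make possible. Carrying only an ID keeps each message small compared with embedding the full schema in every message.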