Dear All:

      I need to generate some data with Samza, publish it to Kafka, and then
write it to Parquet files.  I was asked why I chose Avro rather than Protocol
Buffers for the Samza output to Kafka, since all of our current data on Kafka
is Protocol Buffers messages.

      I explained that Avro-encoded messages have several advantages: the
encoded size is smaller, no extra code compilation step is needed, the
implementation is easier, serialization and deserialization are fast, and
many languages are supported.  However, some people believe that an encoded
Avro message takes about as much space as a Protocol Buffers message, and that
once the schema is included, the Avro message could be much bigger.
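
      To make the size question concrete, here is a rough sketch that
hand-encodes one small record in both wire formats and compares the byte
counts.  It is only an illustration under stated assumptions: the record
(an "id" long and a "name" string), the schema, and the helper names are
all hypothetical, and the encoders are written by hand from the published
encoding rules instead of using the real Avro or Protocol Buffers libraries.

```python
import json

def varint(n: int) -> bytes:
    """Base-128 varint for a non-negative integer (used by both formats)."""
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        if n:
            out.append(b | 0x80)  # more bytes follow
        else:
            out.append(b)
            return bytes(out)

def zigzag(n: int) -> int:
    """Zigzag mapping for 64-bit signed values, as Avro uses for int/long."""
    return (n << 1) ^ (n >> 63)

def avro_encode(rec_id: int, name: str) -> bytes:
    # Avro record body: field values only, in schema order, no tags.
    data = name.encode("utf-8")
    return varint(zigzag(rec_id)) + varint(zigzag(len(data))) + data

def proto_encode(rec_id: int, name: str) -> bytes:
    # Protobuf: each field carries a key = (field_number << 3) | wire_type.
    # Assumes field 1 is a non-negative int64 and field 2 is a string.
    data = name.encode("utf-8")
    return (varint((1 << 3) | 0) + varint(rec_id)
            + varint((2 << 3) | 2) + varint(len(data)) + data)

# Hypothetical schema, only used to show how large the schema text is.
schema = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"},
    ],
}

avro_bytes = avro_encode(12345, "selina")
proto_bytes = proto_encode(12345, "selina")
schema_size = len(json.dumps(schema, separators=(",", ":")).encode("utf-8"))

print("Avro body:      ", len(avro_bytes), "bytes")
print("Protobuf body:  ", len(proto_bytes), "bytes")
print("Avro schema text:", schema_size, "bytes")
```

      For this particular record the sketch gives 10 bytes for the Avro body
and 11 bytes for Protocol Buffers, so the bodies are close in size, while the
schema text alone is several times larger than either.  That matches the
concern above: the comparison depends mostly on whether the schema travels
with every message or is stored once, as in an Avro container file header.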

      Are there any other advantages that made you choose Avro as your
message type on Kafka?  How do you weigh the data size of Avro against that
of Protocol Buffers?

Sincerely,
Selina
