Dear all, I need to generate some data with Samza, send it to Kafka, and then write it out as Parquet format files. I was asked why I chose Avro rather than Protocol Buffers as the format for my Samza output to Kafka, since all of the data currently on our Kafka cluster consists of Protocol Buffers messages.
I explained that Avro-encoded messages have several advantages: the encoded size is smaller, no extra code compilation step is needed, the implementation is easier, serialization and deserialization are fast, and many languages are supported. However, some people believe that an encoded Avro message takes as much space as a Protocol Buffers message, and that once the schema is included the size could be much bigger. Are there any other advantages that made you choose Avro as your message type on Kafka? And how do you weigh the data size of Avro versus Protocol Buffers? Sincerely, Selina
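P.S. To make the size question concrete, here is a rough sketch rather than a benchmark. It hand-encodes one small record the way the Avro binary encoding does (zigzag varints, no field tags) and the way the Protocol Buffers wire format does (a key byte before each field), then compares both with the size of the Avro schema JSON. The `User` record, its two fields, and the helper names `varint`, `zigzag`, `avro_record`, and `protobuf_message` are all made up for illustration, and the sketch assumes a non-negative `user_id` and only `long`/`string` fields.

```python
import json


def varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if n < 0:
        raise ValueError("this sketch only encodes non-negative integers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def zigzag(n: int) -> int:
    """Map a signed 64-bit integer to an unsigned one, as Avro does for int/long."""
    return (n << 1) ^ (n >> 63)


def avro_record(user_id: int, name: str) -> bytes:
    """Avro binary encoding of a record: field values in schema order, no tags."""
    raw = name.encode("utf-8")
    return varint(zigzag(user_id)) + varint(zigzag(len(raw))) + raw


def protobuf_message(user_id: int, name: str) -> bytes:
    """Protocol Buffers wire format for: int64 user_id = 1; string name = 2."""
    raw = name.encode("utf-8")
    key_user_id = varint((1 << 3) | 0)  # field 1, wire type 0 (varint)
    key_name = varint((2 << 3) | 2)     # field 2, wire type 2 (length-delimited)
    return key_user_id + varint(user_id) + key_name + varint(len(raw)) + raw


# Hypothetical Avro schema for the same record, as JSON text.
SCHEMA_JSON = json.dumps({
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "user_id", "type": "long"},
        {"name": "name", "type": "string"},
    ],
})

avro_bytes = avro_record(42, "selina")
proto_bytes = protobuf_message(42, "selina")

print("Avro payload bytes:    ", len(avro_bytes))
print("Protobuf payload bytes:", len(proto_bytes))
print("Avro schema JSON bytes:", len(SCHEMA_JSON.encode("utf-8")))
```

For a record this small the two payloads come out within a couple of bytes of each other, while the schema text is many times larger than either. So the "much bigger" concern only applies if the schema travels with every message. An Avro object container file stores the schema once in its header, so I assume the per-message cost on Kafka depends on how the schema is shared between producer and consumer.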