Hello,

We are building a data processing system that has the following required properties:
- Data is produced/consumed in JSON format
- These JSON documents must always adhere to a schema
- The schema must also be defined in JSON
- It should be possible to evolve schemas and verify schema compatibility

I initially started looking at Avro, not as a solution, but to understand how its schema evolution can be managed. However, I quickly discovered that, with its JSON support, it is able to meet all of my requirements.

I am now considering a system where the data structure is defined using the Avro JSON schema, data is submitted as JSON that is then internally decoded into Avro records, and these records are eventually encoded back into JSON at the point of consumption. It seems to me that I can then take advantage of Avro’s schema evolution features, while only ever exposing JSON to consumers and producers. Aside from the dependency on Avro’s JSON schema syntax, the use of Avro then becomes an internal implementation detail.

As I am completely new to Avro, I was wondering whether this is a credible idea, or whether anyone would care to share their experiences of similar systems that they have built?

Many thanks,
Elliot.
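
P.S. To make the schema compatibility requirement concrete, here is a rough sketch in Python. It deliberately uses only the standard library rather than the Avro library, and it covers just one resolution rule from the Avro specification: a field that the reader's schema expects but the writer's schema lacks must declare a default, while fields the reader does not know about are ignored. The `Event` schemas and the `can_read` and `upgrade` helpers are hypothetical names made up for illustration; real Avro schema resolution also handles type promotion, unions, aliases, and nested types, none of which are modelled here.

```python
import json

# Hypothetical schemas written in Avro's JSON schema syntax.
V1 = json.loads("""
{"type": "record", "name": "Event",
 "fields": [{"name": "id", "type": "string"}]}
""")

# V2 adds a field WITH a default, so it can still read V1 data.
V2 = json.loads("""
{"type": "record", "name": "Event",
 "fields": [{"name": "id", "type": "string"},
            {"name": "source", "type": "string", "default": "unknown"}]}
""")

# V3 adds a field WITHOUT a default, so it cannot read V1 data.
V3 = json.loads("""
{"type": "record", "name": "Event",
 "fields": [{"name": "id", "type": "string"},
            {"name": "source", "type": "string"}]}
""")


def can_read(reader_schema: dict, writer_schema: dict) -> bool:
    """Simplified check for flat records only (an assumption of this sketch).

    Returns True if data written with writer_schema could be read using
    reader_schema. Types must match exactly here; real Avro additionally
    allows promotions such as int to long.
    """
    writer_fields = {f["name"]: f for f in writer_schema["fields"]}
    for field in reader_schema["fields"]:
        written = writer_fields.get(field["name"])
        if written is None:
            # Reader expects a field the writer never produced:
            # only acceptable if the reader declares a default.
            if "default" not in field:
                return False
        elif written["type"] != field["type"]:
            return False
    return True


def upgrade(document: dict, reader_schema: dict) -> dict:
    """Project a decoded JSON document onto reader_schema.

    Missing fields are filled from defaults; fields unknown to the
    reader are dropped. Raises KeyError if a field is missing and the
    reader schema gives no default for it.
    """
    result = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        result[name] = document[name] if name in document else field["default"]
    return result


if __name__ == "__main__":
    print(can_read(V2, V1))  # new reader, old data
    print(can_read(V1, V2))  # old reader, new data
    print(can_read(V3, V1))  # new reader without a default, old data
    print(json.dumps(upgrade({"id": "a1"}, V2)))
```

In the system described above, a check along these lines would run whenever a new schema version is registered, and the JSON produced at the point of consumption would be the projected document. With the Avro library itself, both steps would be delegated to its own schema resolution rather than hand-written.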