Hi all,

I’ve been thinking a lot recently about JSON and Kafka. Because JSON is not strongly typed, it isn’t treated as a first-class citizen of the Kafka ecosystem. At Wikimedia, we use JSONSchema-validated JSON <https://blog.wikimedia.org/2017/01/13/json-hadoop-kafka/> for Kafka messages. This makes it easy for our many disparate teams and services to consume data from Kafka without having to consult a remote schema registry to read it. (Yes, we have to worry about schema evolution, but we handle this on the producer side by requiring that the only schema change allowed is adding optional fields.)
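As a rough illustration of the "only adding optional fields" rule, here is a minimal Python sketch of a producer-side check. The function name `only_adds_optional_fields` is made up for this example and is not Wikimedia's actual tooling. It looks only at `type`, `properties`, and `required`, and ignores every other JSONSchema keyword.

```python
def only_adds_optional_fields(old: dict, new: dict) -> bool:
    """Return True if `new` differs from `old` only by added optional fields.

    Hypothetical helper: a sketch of the evolution rule, not a full
    JSONSchema comparison.
    """
    # Non-object schemas (string, integer, ...) must be identical.
    if old.get("type") != "object" or new.get("type") != "object":
        return old == new

    # The set of required fields may not change, so anything added is optional.
    if set(old.get("required", [])) != set(new.get("required", [])):
        return False

    old_props = old.get("properties", {})
    new_props = new.get("properties", {})

    # Every existing field must still exist; nested objects are checked recursively.
    return all(
        name in new_props and only_adds_optional_fields(spec, new_props[name])
        for name, spec in old_props.items()
    )


old_schema = {
    "type": "object",
    "properties": {"id": {"type": "integer"}},
    "required": ["id"],
}
# Allowed: a new field that is not listed in "required".
new_schema = {
    "type": "object",
    "properties": {"id": {"type": "integer"}, "comment": {"type": "string"}},
    "required": ["id"],
}
print(only_adds_optional_fields(old_schema, new_schema))
```

A check like this could run wherever schemas are reviewed or merged, so that consumers never see a field disappear or change type.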

There’s been discussion <https://github.com/confluentinc/schema-registry/issues/220> about JSONSchema support in Confluent’s Schema Registry, or perhaps even support for producing validated Avro JSON (not binary) from Kafka REST Proxy. However, the more I think about this, the more I realize that I don’t really care about JSON support in Confluent products. What I (and I betcha most of the folks who commented on the issue <https://github.com/confluentinc/schema-registry/issues/220>) really want is the ability to use Kafka Connect with JSON data. Kafka Connect does sort of support this, but only if your JSON messages conform to its very specific envelope schema format <https://github.com/apache/kafka/blob/trunk/connect/json/src/main/java/org/apache/kafka/connect/json/JsonSchema.java#L61>.

What if... Kafka Connect provided a JSONSchemaConverter (*not* Connect’s JsonConverter) that knew how to convert between a provided JSONSchema and Kafka Connect’s internal Schemas? Would this enable what I think it would? Would it allow Connectors to be configured with JSONSchemas so they could read JSON messages directly from a Kafka topic? Once read and converted to a ConnectRecord, the messages could be used with any Connector out there, right?

I might have space in the next year to work on something like this, but I thought I’d ask here first to see what others think. Would this be useful? If so, is this something that might be upstreamed into Apache Kafka?

- Andrew Otto
  Senior Systems Engineer
  Wikimedia Foundation
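
To make the converter idea above a bit more concrete, here is a rough Python sketch of the mapping such a JSONSchemaConverter would need. It takes a JSONSchema and derives the Connect-style schema description used in the envelope that JsonConverter expects when schemas are enabled (a top-level object with `schema` and `payload`). The function names and the type mapping (for example JSONSchema `integer` to Connect `int64`) are assumptions for illustration only; a real converter would be written in Java against Connect’s Converter interface and would have to handle many more JSONSchema features.

```python
# Assumed mapping from JSONSchema primitive types to Connect schema type names.
PRIMITIVE_TYPES = {
    "string": "string",
    "integer": "int64",
    "number": "double",
    "boolean": "boolean",
}


def to_connect_schema(json_schema: dict, optional: bool = False) -> dict:
    """Translate a (simple) JSONSchema into a Connect-style schema description."""
    schema_type = json_schema["type"]

    if schema_type == "object":
        required = set(json_schema.get("required", []))
        fields = []
        for name, spec in json_schema.get("properties", {}).items():
            # A field is optional in Connect unless JSONSchema marks it required.
            field = to_connect_schema(spec, optional=name not in required)
            field["field"] = name
            fields.append(field)
        return {"type": "struct", "fields": fields, "optional": optional}

    if schema_type == "array":
        items = to_connect_schema(json_schema["items"])
        return {"type": "array", "items": items, "optional": optional}

    return {"type": PRIMITIVE_TYPES[schema_type], "optional": optional}


def wrap_in_envelope(json_schema: dict, message: dict) -> dict:
    """Build the schema-plus-payload envelope around a plain JSON message."""
    return {"schema": to_connect_schema(json_schema), "payload": message}


page_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "rev_id": {"type": "integer"},
    },
    "required": ["title"],
}
envelope = wrap_in_envelope(page_schema, {"title": "Main Page", "rev_id": 42})
```

With a converter doing this translation internally, producers could keep writing plain JSON to the topic, and the schema would be supplied through connector configuration instead of being repeated in every message.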