I have a Kafka-related challenge and am hoping someone else has faced this or has some pointers. This is NOT a *schema registry* question; it's a question about generating the schemas in the first place. I already know how I'll manage these schemas once they're created.
I need to manage potentially several hundred topics, primarily sourced from a relational database accessible via JDBC, with several hundred consumers subscribing to them. The relational schema changes regularly, and those changes then need to be made to the Avro schemas used in the topics and the processors.

I have a few solutions in mind:

1. Use Spark-Avro from Databricks to load the tables into a DataFrame and write them out in Avro format, which then gives me a starting point.
2. Use Avro-SQL from Landoop -- but I'm not sure whether it needs an existing table or whether I can just give it arbitrary SQL.
3. Use other tools such as CSV-to-Avro or JSON-to-Avro, but each requires some preprocessing (e.g. dumping the tables to JSON first).
4. Any other options?

The goal is to walk through the tables in the database, review the metadata, and generate Avro schemas, which would then be versioned/managed elsewhere (rough sketch in the P.S. below). If there are changes to the topic group in general, we'd automatically add/delete topics in Kafka. I just don't want to task the team with manually creating these Avro schemas and topics. If I'm going about this completely out of left field, let me know.

Best,

--
Rahul Singh
rahul.si...@anant.us
Anant Corporation
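
P.S. In case it helps clarify what I mean by the metadata walk, here's a rough, untested sketch using plain JDBC DatabaseMetaData plus Avro's SchemaBuilder. The connection string, schema/namespace names, and the type mapping are all made up for illustration -- decimals, dates, etc. would really need Avro logical types:

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;

import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Types;
import java.util.Arrays;

public class AvroSchemaWalker {

    public static void main(String[] args) throws Exception {
        // Placeholder connection string -- point this at the real source DB.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/sourcedb", "user", "pass")) {
            DatabaseMetaData md = conn.getMetaData();
            try (ResultSet tables = md.getTables(null, "public", "%", new String[]{"TABLE"})) {
                while (tables.next()) {
                    String table = tables.getString("TABLE_NAME");
                    // Print for now; in practice, hand the schema off to versioning.
                    System.out.println(schemaFor(md, table).toString(true));
                }
            }
        }
    }

    // Build one Avro record schema from a table's column metadata.
    static Schema schemaFor(DatabaseMetaData md, String table) throws Exception {
        SchemaBuilder.FieldAssembler<Schema> fields =
                SchemaBuilder.record(table).namespace("com.example.generated").fields();
        try (ResultSet cols = md.getColumns(null, "public", table, "%")) {
            while (cols.next()) {
                String name = cols.getString("COLUMN_NAME");
                Schema base = avroTypeFor(cols.getInt("DATA_TYPE"));
                boolean nullable =
                        cols.getInt("NULLABLE") == DatabaseMetaData.columnNullable;
                // Nullable columns become a union of null and the base type.
                Schema type = nullable
                        ? Schema.createUnion(Arrays.asList(
                                Schema.create(Schema.Type.NULL), base))
                        : base;
                fields = fields.name(name).type(type).noDefault();
            }
        }
        return fields.endRecord();
    }

    // Crude java.sql.Types -> Avro mapping; everything unknown falls back to string.
    static Schema avroTypeFor(int jdbcType) {
        switch (jdbcType) {
            case Types.TINYINT:
            case Types.SMALLINT:
            case Types.INTEGER:  return Schema.create(Schema.Type.INT);
            case Types.BIGINT:   return Schema.create(Schema.Type.LONG);
            case Types.REAL:     return Schema.create(Schema.Type.FLOAT);
            case Types.FLOAT:
            case Types.DOUBLE:   return Schema.create(Schema.Type.DOUBLE);
            case Types.BIT:
            case Types.BOOLEAN:  return Schema.create(Schema.Type.BOOLEAN);
            default:             return Schema.create(Schema.Type.STRING);
        }
    }
}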
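
And for the automatic add/delete side, a similarly rough sketch with Kafka's AdminClient -- assuming topic names map one-to-one to table names, and that the cluster only hosts these table-backed topics (otherwise the delete step below is obviously too aggressive). Partition count and replication factor are placeholders:

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.Arrays;
import java.util.HashSet;
import java.util.Properties;
import java.util.Set;
import java.util.stream.Collectors;

public class TopicSync {

    public static void main(String[] args) throws Exception {
        // In practice this set would come from the metadata walk above.
        Set<String> desired = new HashSet<>(Arrays.asList("customers", "orders"));

        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        try (AdminClient admin = AdminClient.create(props)) {
            // listTopics() excludes internal topics by default.
            Set<String> existing = admin.listTopics().names().get();

            // Create topics that have a backing table but don't exist yet.
            admin.createTopics(desired.stream()
                    .filter(t -> !existing.contains(t))
                    .map(t -> new NewTopic(t, 3, (short) 1))
                    .collect(Collectors.toList())).all().get();

            // Delete topics whose backing table has gone away.
            Set<String> stale = new HashSet<>(existing);
            stale.removeAll(desired);
            admin.deleteTopics(stale).all().get();
        }
    }
}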