Hi community,

I would like to run an idea by you.

I was thinking that Flink SQL could be Flink's answer to Kafka Connect (more powerful, with advantages like being decoupled from Kafka). Flink SQL would be the configuration language for Flink "connectors". Sounds great! But one thing prevents me from implementing this idea: there is no way to run SQL-based processing over multiple similar inputs and produce multiple similar outputs (counted in tens or hundreds).

As an example of the problem I need to solve, consider that I have a hundred Kafka topics with similar data in each, and I would like to sink them to a SQL database. With Kafka Connect, I can use a single connector with a JDBC sink that, properly configured, will dump each topic to a separate table while keeping the schema (based on what is in the schema registry). With Flink SQL, I believe I would need to run one query per topic/table.

The same applies to sourcing data. There is this cool project, flink-cdc-connectors [1], that leverages Debezium in Flink to apply CDC to a SQL database, but when used with SQL it can only pull in one table per query.

These cases can be solved using the DataStream API. With it, I can code pulling in and pushing out multiple table streams. But then "the configuration" is a much bigger effort, because it requires writing Java code. That is a few hours versus a few days, an enormous difference.
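To make the "one query per topic" point concrete, here is a minimal Python sketch of the workaround I would otherwise have to fall back on: generating the Flink SQL from a template. The topic names, the shared column list, the `src_`/`dst_` table prefixes, and the connection settings are all hypothetical placeholders, and the sketch assumes every topic has the same schema and a name that is a valid SQL identifier.

```python
# Sketch only: generate one source table, one sink table, and one INSERT
# per Kafka topic. All names and connection settings are hypothetical.

TOPICS = ["orders", "payments", "shipments"]  # stand-in for ~100 topics
COLUMNS = "id BIGINT, payload STRING"         # assumed shared schema


def statements_for(topic):
    """Return the three Flink SQL statements needed for a single topic."""
    source = (
        f"CREATE TABLE src_{topic} ({COLUMNS}) WITH ("
        f"'connector' = 'kafka', 'topic' = '{topic}', "
        "'properties.bootstrap.servers' = 'localhost:9092', "
        "'format' = 'json')"
    )
    sink = (
        f"CREATE TABLE dst_{topic} ({COLUMNS}) WITH ("
        "'connector' = 'jdbc', "
        "'url' = 'jdbc:postgresql://localhost:5432/db', "
        f"'table-name' = '{topic}')"
    )
    insert = f"INSERT INTO dst_{topic} SELECT * FROM src_{topic}"
    return [source, sink, insert]


def all_statements(topics):
    """Flatten the per-topic statements into one list, in topic order."""
    statements = []
    for topic in topics:
        statements.extend(statements_for(topic))
    return statements


if __name__ == "__main__":
    for statement in all_statements(TOPICS):
        print(statement + ";")
```

This only automates writing the queries. It still produces three statements per topic, and it does not pick up per-topic schemas from a schema registry the way the Kafka Connect JDBC sink does.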
So, in the end, some questions:

* Do you know how SQL could be extended to handle such cases elegantly, with a single job in the end?
* Or do you believe SQL should not be used for this case, and we should come up with a different tool and configuration language, i.e. something like Kafka Connect?
* Do you know of any other project that implements this idea?

I definitely believe that being an easy-to-use ingress from/egress to Kafka/HDFS/whatever system is a great use case for Flink, and therefore there is a need for a solution to my case.

Thanks for answering!
Krzysztof

[1] https://github.com/ververica/flink-cdc-connectors
