Hi everyone,

As mentioned in the previous discussion thread [1], I'm opening a parallel discussion thread on moving connectors from Flink to external connector repositories. If you haven't followed that discussion yet, I recommend reading it first.
The goal of the external connector repositories is to make it easier to develop and release connectors by decoupling them from the release cycle of Flink itself. This should result in faster connector releases, a more active connector community and a reduced build time for Flink.

We currently have the following connectors available in Flink itself:

* Kafka -> For DataStream & Table/SQL users
* Upsert-Kafka -> For Table/SQL users
* Cassandra -> For DataStream users
* Elasticsearch -> For DataStream & Table/SQL users
* Kinesis -> For DataStream & Table/SQL users
* RabbitMQ -> For DataStream users
* Google Cloud PubSub -> For DataStream users
* Hybrid Source -> For DataStream users
* NiFi -> For DataStream users
* Pulsar -> For DataStream users
* Twitter -> For DataStream users
* JDBC -> For DataStream & Table/SQL users
* FileSystem -> For DataStream & Table/SQL users
* HBase -> For DataStream & Table/SQL users
* DataGen -> For Table/SQL users
* Print -> For Table/SQL users
* BlackHole -> For Table/SQL users
* Hive -> For Table/SQL users

I propose to move out all connectors except Hybrid Source, FileSystem, DataGen, Print and BlackHole, because:

* We should avoid at all costs that certain connectors come to be seen as 'Core' connectors. If that happens, it creates the perception that there are first-grade/high-quality connectors because they live in 'Core' Flink, and second-grade/lesser-quality connectors because they live outside of the Flink codebase. It also directly undermines the goal, because those connectors would still be bound to the release cycle of Flink. Last but not least, it risks the success of the external connector repositories, since every connector contributor would still want to be in 'Core' Flink.

* To continue on the quality of connectors: we should aim for all connectors to be of high quality. That means a connector shouldn't be available only for DataStream users or only for Table/SQL users, but for both.
It also means that (where applicable) a connector should support all capabilities, like bounded and unbounded scan, lookup, and batch and streaming sinks. In the end, the quality should depend on the maintainers of the connector, not on where the code is maintained.

* The Hybrid Source connector is a special case because of its purpose: it combines other sources rather than connecting to an external system.

* The FileSystem, DataGen, Print and BlackHole connectors are important for first-time Flink users and testers. If you want to experiment with Flink, you will most likely start with a local file before moving on to one of the other sources or sinks. These four connectors help with reading/writing local files and with generating, displaying or ignoring data.

* Some of the connectors haven't been maintained in a long time (for example, NiFi and Google Cloud PubSub). An argument could be made that for such connectors we should first decide whether we actually want to move them, or whether we should drop them entirely.

I'm looking forward to your thoughts!

Best regards,

Martijn Visser | Product Manager
mart...@ververica.com

[1] https://lists.apache.org/thread/bywh947r2f5hfocxq598zhyh06zhksrm