echauchot commented on code in PR #641:
URL: https://github.com/apache/flink-web/pull/641#discussion_r1183612883

##########
docs/content/posts/2023-04-13-howto-create-batch-source.md:
##########
@@ -0,0 +1,280 @@
+---
+title: "Howto create a batch source with the new Source framework"
+date: "2023-04-13T08:00:00.000Z"
+authors:
+
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. Some connectors have migrated to this new framework. This article is a how-to for creating a
+batch
+source using this new framework. It was built while implementing
+the [Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+If you are interested in contributing or migrating connectors, this blog post is for you!.
+
+## Implementing the source components
+
+The aim here is not to duplicate the official documentation. For details you should read the
+documentation, the javadocs or the Cassandra connector code. The links are above. The goal here is
+to give field feedback on how to implement the different components.
+
+The source architecture is depicted in the diagrams below:
+
+![](/img/blog/2023-04-13-howto-create-batch-source/source_components.svg)
+
+![](/img/blog/2023-04-13-howto-create-batch-source/source_reader.svg)
+
+### Source
+
+[example Cassandra Source](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/CassandraSource.java)
+
+The source interface only does the "glue" between all the other components. Its role is to
+instantiate all of them and to define the
+source [Boundedness](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/api/connector/source/Boundedness.html)
+. We also do the source configuration
+here along with user configuration validation.
+
+### SourceReader
+
+[example Cassandra SourceReader](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/reader/CassandraSourceReader.java)
+
+As shown in the graphic above, the instances of
+the [SourceReader](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/api/connector/source/SourceReader.html) (
+which we will call simply readers

Review Comment:
   yes it is the auto format that messed it up