Hi,

As a community contributor, I would like to point out a few problems the community is currently facing:

1. A connector has to be written twice (once for each engine), even though the second implementation is mostly a copy with few differences.
2. Adapting to a new Flink or Spark version requires a large number of code changes, and every upgrade or version adaptation involves extensive testing and modification.
3. When we want to improve peripheral features such as the UI and monitoring, we need a unified API.
I'm glad to see Zongwen bring this up for discussion. I think it is a good idea. My only concern is whether we can complete an abstract API design, given the differences between the engines' own APIs. This is just a cautious concern; in any case, I support the design and would like to join the effort.

BTW, we only support discussion in English.

Best wishes!
Calvin Kirs

On 05/8/2022 15:20, jianju1024 <[email protected]> wrote:

Hi all,

I feel the same way: is it supposed to be like Spark? Flink? Beam? It ends up resembling none of them!

1. Why break ETL apart? What exactly are E, T, and L? They cannot really be separated in the first place.
2. On the topic of building on multiple engines: in several WeChat groups I am in, many people have discussed it at length without reaching a conclusion.
3. If the goal is to solve ETL problems, why get hung up on abstracting over two different engines?
4. Compared with DataX and FlinkX, it falls far short.

On May 7, 2022, at 17:36, Zongwen Li <[email protected]> wrote:

The goal of Apache SeaTunnel is different from that of Apache Beam. Apache SeaTunnel focuses on source and sink connectors and develops features in the field of data integration. Apache Beam focuses on unifying all the functions of the compute engine, including operators such as join, connect, and map, and it does not unify streaming and batch sources.

This improvement proposal is meant to solve the problems SeaTunnel currently faces. If you have better ideas, please bring them up for discussion.

Best,
Zongwen Li

On Fri, Apr 29, 2022 at 16:14, leo65535 <[email protected]> wrote:

Hi @zongwen,

I don't think this is a good idea. It seems that we will become more and more like Apache Beam.

Best,
Leo65535

At 2022-04-18 15:10:08, "李宗文" <[email protected]> wrote:

Hi All,

In the current implementation of SeaTunnel, the connector is coupled with the computing engine. As a result, a connector has to be implemented separately for each engine, and it is difficult to support multiple versions of an engine.
The questionnaire showed that users run multiple versions of the Spark and Flink engines, and that they also hope SeaTunnel will support Change Data Capture (CDC) connectors.

Based on the problems and needs above, I created an improvement proposal:
https://github.com/apache/incubator-seatunnel/issues/1608

Preliminary ideas for the Source and Sink API:
https://github.com/apache/incubator-seatunnel/issues/1701
https://github.com/apache/incubator-seatunnel/issues/1704

Please discuss away!

Zongwen Li
