[GitHub] [incubator-seatunnel] CalvinKirs commented on a diff in pull request #3619: [Doc] improve README and other documents

GitBox Wed, 30 Nov 2022 18:10:42 -0800


CalvinKirs commented on code in PR #3619:
URL: https://github.com/apache/incubator-seatunnel/pull/3619#discussion_r1036624475



##########
README.md:
##########
@@ -19,49 +19,43 @@ been used in the production of nearly 100 companies.
 
 ## Why do we need SeaTunnel
 
-SeaTunnel will do its best to solve the problems that may be encountered in 
the synchronization of massive data:
+SeaTunnel focuses on data integration and data synchronization, and is mainly 
designed to solve common problems in the field of data integration:
 
-- Data loss and duplication
-- Task accumulation and delay
-- Low throughput
-- Long cycle to be applied in the production environment
-- Lack of application running status monitoring
+- Various data sources: There are hundreds of commonly used data sources, 
many with incompatible versions. With the emergence of new technologies, more 
data sources are appearing. It is difficult for users to find a tool that can 
fully and quickly support these data sources.
+- Complex synchronization scenarios: Data synchronization needs to support 
various synchronization scenarios such as offline-full synchronization, 
offline-incremental synchronization, CDC, real-time synchronization, and full 
database synchronization.
+- High resource demand: Existing data integration and data synchronization 
tools often require vast computing resources or JDBC connection resources to 
complete real-time synchronization of a massive number of small tables. This 
increases the burden on enterprises.
+- Lack of quality and monitoring: Data integration and synchronization 
processes often experience loss or duplication of data. The synchronization 
process lacks monitoring, so users cannot easily see the actual state of the 
data while a task is running.
+- Complex technology stack: The technology components used by enterprises are 
different, and users need to develop corresponding synchronization programs for 
different components to complete data integration.
+- Difficulty in management and maintenance: Limited by different underlying 
technology components (Flink/Spark), offline synchronization and real-time 
synchronization often have to be developed and managed separately, which 
increases the difficulty of management and maintenance.
 
-## SeaTunnel use scenarios
+## Features of SeaTunnel
 
-- Mass data synchronization
-- Mass data integration
-- ETL with massive data
-- Mass data aggregation
-- Multi-source data processing
+- Rich and extensible Connector: SeaTunnel provides a Connector API that does 
not depend on a specific execution engine. Connectors (Source, Transform, Sink) 
developed based on this API can run on many different engines, such as the 
currently supported SeaTunnel Engine, Flink, and Spark.
+- Connector plugin: The plugin design allows users to easily develop their own 
Connector and integrate it into the SeaTunnel project. Currently, SeaTunnel 
supports more than 70 Connectors, and the number is growing quickly. Here is 
the list of currently supported connectors: xxxxxxx, and the list of planned 
connectors: xxxxxxx.

Review Comment:
   We can link the connector-status and connector-support plan documents here.


