PostSQL feature.

gaojun2048 <[email protected]> wrote on Wed, Mar 6, 2024 at 15:25:
> 1. Job Run On K8s/Yarn
>
> Currently, the Zeta engine supports only local and standalone modes when
> submitting jobs. Standalone is suitable for CDC synchronization scenarios
> with a large number of small tables. Real-time CDC synchronization
> typically holds resources for a long time while moving only a small amount
> of data, so sharing resources in standalone mode can effectively improve
> resource utilization. In offline batch synchronization, however, running
> each job in a separate process reduces interference between jobs. The most
> effective way to do that today is to submit the job to k8s or yarn.
>
> 2. More connector support
>
> 3. Catalog adapts to more connectors.
>
> Catalog adaptations mainly help connectors obtain more accurate data
> structure information and facilitate downstream automatic table creation
> and implicit data type conversion. However, only a portion of connectors
> currently implement catalog-equivalent interfaces, and more connectors
> need to implement them in the future.
>
> 4. Design and adaptation of TypeConverter and DataTypeConverter.
>
> The goal of TypeConverter is to enable each connector to describe more
> accurately the conversion and inverse conversion between the database's
> own data types and SeaTunnel data types. Development at the API level
> should be complete; all connectors still need to be adapted and
> implemented in the future. TypeConverter can help SeaTunnel perform data
> model inference and generate table creation statements during automatic
> table creation. DataTypeConverter will work together with TypeConverter to
> help SeaTunnel achieve implicit conversion of data types between different
> databases. For example, in the JDBC Oracle Sink scenario, to decide when
> to use setString and when to use blob when writing SeaTunnel String types,
> DataTypeConverter will combine with TypeConverter to determine the length
> of the field, the current field type, and other information.
>
> 5. Event notification mechanism.
>
> Currently, SeaTunnel lacks a mechanism for passing events such as task
> failure, task success, and the occurrence of certain other events.
>
> 6. Table level monitoring.
>
> The current job monitoring information is job level, while the latest
> version of SeaTunnel already supports multi-table synchronization, which
> synchronizes data from multiple tables in one job. The goal of table level
> monitoring is to let users see the synchronization status of each table
> through monitoring information.
>
> 7. Dirty data collection.
>
> During synchronization tasks, some data occasionally cannot be written to
> the target end properly. The current approach is to fail the job directly.
> We plan to support dirty data collection: data that cannot be written is
> stored as dirty data first, without affecting the normal operation of the
> job.
>
> Jia Fan <[email protected]> wrote on Wed, Mar 6, 2024 at 14:55:
>
>> We need to provide a mature solution based on yarn or k8s.
>>
>> ________________________
>>
>> Jia Fan
>>
>> > On Mar 5, 2024, at 15:18, gaojun2048 <[email protected]> wrote:
>> >
>> > Hi, Community,
>> >
>> > The SeaTunnel community has made significant progress in 2023, with
>> > SeaTunnel's features becoming increasingly powerful and the number of
>> > users growing rapidly. Thank you to everyone in the community.
>> >
>> > As a professional tool for data synchronization, SeaTunnel still has a
>> > lot of work to complete, such as run on k8s, run on yarn, more
>> > connectors, more comprehensive support for automatic table creation,
>> > and data type inference.
>> >
>> > Here, everyone can discuss SeaTunnel's 2024 roadmap, which will
>> > determine the main goals and directions of the community in the future.
>> >
>> > --
>> > Best Regards
>> > ------------
>> > Apache ID: gaojun2048
>> > Github ID: EricJoy2048
>> > Mail: [email protected]
>>
>
> --
> Best Regards
> ------------
> Apache ID: gaojun2048
> Github ID: EricJoy2048
> Mail: [email protected]

--
Best Regards
------------
Apache ID: gaojun2048
Github ID: EricJoy2048
Mail: [email protected]
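
P.S. To make item 7 of the quoted roadmap (dirty data collection) more concrete, here is a rough sketch of the idea: rows that fail to write are set aside with a reason instead of failing the whole job. All names here (DirtyDataSketch, RowWriter, DirtyRecord, Result, writeAll) are hypothetical and for illustration only; this is not the SeaTunnel sink API, and a real design would also need to decide where dirty rows are stored and how many are tolerated.

```java
import java.util.ArrayList;
import java.util.List;

public class DirtyDataSketch {

    /** A row that could not be written, plus the failure reason (hypothetical type). */
    record DirtyRecord(String row, String reason) {}

    /** Minimal stand-in for a sink writer; NOT the SeaTunnel SinkWriter interface. */
    interface RowWriter {
        void write(String row) throws Exception;
    }

    /** Outcome of a run: number of rows written and the rows set aside as dirty. */
    record Result(int written, List<DirtyRecord> dirty) {}

    /** Writes every row; a failing row is collected as dirty instead of aborting. */
    static Result writeAll(List<String> rows, RowWriter writer) {
        int written = 0;
        List<DirtyRecord> dirty = new ArrayList<>();
        for (String row : rows) {
            try {
                writer.write(row);
                written++;
            } catch (Exception e) {
                // Keep the job running and remember why this row failed.
                dirty.add(new DirtyRecord(row, e.getMessage()));
            }
        }
        return new Result(written, dirty);
    }

    public static void main(String[] args) {
        // Toy writer that rejects empty rows, standing in for a target-side error.
        RowWriter writer = row -> {
            if (row.isEmpty()) {
                throw new IllegalArgumentException("empty row");
            }
        };
        Result result = writeAll(List.of("a", "", "b"), writer);
        System.out.println("written=" + result.written()
                + ", dirty=" + result.dirty().size());
    }
}
```

In this sketch the dirty rows are only held in memory; persisting them and capping their number before failing the job are left open.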
