Data skew caused by large and small tables in Flink CDC whole-database synchronization

2024-02-06 Thread casel.chen
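When using a Flink CDC 3.0 YAML job for whole-database synchronization from MySQL to Doris, I found that data skew occurs: the large TM has to process 180G of data, while the small TM has only 30G. Some large upstream tables have very heavy traffic, while the small tables have almost none. Is there any way to avoid this data skew?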

Watermark alignment without idleness

2024-02-06 Thread Alexis Sarda-Espinosa
Hello, I was reading through the comments in [1] and it seems that enabling watermark alignment implicitly activates some idleness logic "if the source waits for alignment for a long time" (even if withIdleness is not called explicitly during the creation of WatermarkStrategy). Is this time

[ANNOUNCE] Apache Celeborn(incubating) 0.4.0 available

2024-02-06 Thread Fu Chen
Hi all, the Apache Celeborn (Incubating) community is glad to announce the new release of Apache Celeborn (Incubating) 0.4.0. Celeborn is dedicated to improving the efficiency and elasticity of different map-reduce engines and provides an elastic, highly efficient service for intermediate data including

Re: Request to provide sample codes on Data stream using flink, Spring Boot & kafka

2024-02-06 Thread Alexis Sarda-Espinosa
Hello, check this thread from some months ago, but keep in mind that it's not really officially supported by Flink itself: https://lists.apache.org/thread/l0pgm9o2vdywffzdmbh9kh7xorhfvj40 Regards, Alexis. On Tue, 6 Feb 2024 at 12:23, Fidea Lidea <lideafidea...@gmail.com> wrote: > Hi

Re: Idleness not working if watermark alignment is used

2024-02-06 Thread Alexis Sarda-Espinosa
Hi Matthias, I think I understand the implications of idleness. In my case I really do need it since even in the production environment one of the Kafka topics will receive messages only sporadically. With regard to the code, I have very limited understanding of Flink internals, but that part I

Request to provide sample codes on Data stream using flink, Spring Boot & kafka

2024-02-06 Thread Fidea Lidea
Hi Team, could you please provide sample code for data streaming using Flink, Kafka, and Spring Boot? Awaiting your response. Thanks & Regards, Nida Shaikh

RE: Idleness not working if watermark alignment is used

2024-02-06 Thread Schwalbe Matthias
Hi Alexis, Yes, I guess so, although I am not thoroughly familiar with that part of the code. Apparently the SourceCoordinator cannot come up with a proper watermark time if watermarking is turned off (idle mode of stream), and then it deduces the watermark time from the remaining non-idle sources. It’s

Re: Idleness not working if watermark alignment is used

2024-02-06 Thread Alexis Sarda-Espinosa
Hi Matthias, thanks for looking at this. Would you then say this comment in the source code is not really valid? https://github.com/apache/flink/blob/release-1.18/flink-runtime/src/main/java/org/apache/flink/runtime/source/coordinator/SourceCoordinator.java#L181 That's where the log I was