Hi,

Thanks for driving this, +1 for the FLIP.

Best,
Ferenc

On Monday, March 11th, 2024 at 15:17, Ahmed Hamdy <[email protected]> wrote:

> Hello,
> Thanks for the proposal, +1 for the FLIP.
>
> Best Regards
> Ahmed Hamdy
>
> On Mon, 11 Mar 2024 at 15:12, wudi [email protected] wrote:
>
> > Hi, Leonard
> > Thank you for your suggestion.
> > I referred to other Connectors[1], modified the naming and types of
> > relevant parameters[2], and also updated the FLIP.
> >
> > [1]
> > https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/table/overview/
> > [2]
> > https://github.com/apache/doris-flink-connector/blob/master/flink-doris-connector/src/main/java/org/apache/doris/flink/table/DorisConfigOptions.java
> >
> > Brs,
> > di.wu
> >
> > > On March 7, 2024, at 14:33, Leonard Xu [email protected] wrote:
> > >
> > > Thanks wudi for the update, the FLIP generally looks good to me, I
> > > only left two minor suggestions:
> > >
> > > (1) The suffix `.s` in the config option
> > > doris.request.query.timeout.s looks strange to me, could we change
> > > the value type of all time-interval-related options to Duration?
> > >
> > > (2) Could you check and improve all config options like
> > > `doris.exec.mem.limit` to make them follow Flink config option
> > > naming and value types?
> > >
> > > Best,
> > > Leonard
> > >
> > > > On March 6, 2024, at 06:12, Jing Ge [email protected] wrote:
> > > >
> > > > > Hi Di,
> > > > >
> > > > > Thanks for your proposal. +1 for the contribution. I'd like to
> > > > > know your thoughts about the following questions:
> > > > >
> > > > > 1. According to your clarification of the exactly-once, thanks
> > > > > for it BTW, no PreCommitTopology is required. Does it make sense
> > > > > to let DorisSink[1] implement SupportsCommitter, since the
> > > > > TwoPhaseCommittingSink is deprecated[2], before turning the Doris
> > > > > connector into a Flink connector?
> > > > > 2. OLAP engines are commonly used as the tail/downstream of a
> > > > > data pipeline to support further e.g. ad-hoc query or cube with
> > > > > feasible pre-aggregation. Just out of curiosity, would you like
> > > > > to share some real use cases that will use OLAP engines as the
> > > > > source of a streaming data pipeline? Or will it only be used as
> > > > > the source for batch?
> > > > > 3. The E2E test only covered the sink[3], if I am not mistaken.
> > > > > Would you like to test the source in E2E too?
> > > > >
> > > > > [1]
> > > > > https://github.com/apache/doris-flink-connector/blob/43e0e5cf9b832854ea228fb093077872e3a311b6/flink-doris-connector/src/main/java/org/apache/doris/flink/sink/DorisSink.java#L55
> > > > > [2]
> > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-372%3A+Enhance+and+synchronize+Sink+API+to+match+the+Source+API
> > > > > [3]
> > > > > https://github.com/apache/doris-flink-connector/blob/43e0e5cf9b832854ea228fb093077872e3a311b6/flink-doris-connector/src/test/java/org/apache/doris/flink/tools/cdc/MySQLDorisE2ECase.java#L96
> > > > >
> > > > > Best regards,
> > > > > Jing
> > > > >
> > > > > On Tue, Mar 5, 2024 at 11:18 AM wudi [email protected] wrote:
> > > > >
> > > > > > Hi, Jeyhun Karimov.
> > > > > > Thanks for your question.
> > > > > >
> > > > > > - How to ensure Exactly-Once?
> > > > > > 1. When the Checkpoint Barrier arrives, DorisSink will trigger
> > > > > > the precommit api of StreamLoad to complete the persistence of
> > > > > > data in Doris (the data will not be visible at this time), and
> > > > > > will also pass this TxnID to the Committer.
> > > > > > 2. When this Checkpoint of the entire Job is completed, the
> > > > > > Committer will call the commit api of StreamLoad and commit the
> > > > > > TxnID to complete the visibility of the transaction.
> > > > > > 3. When the task is restarted, the Txn with successful
> > > > > > precommit and failed commit will be aborted based on the
> > > > > > label-prefix, and Doris' abort API will be called. (At the same
> > > > > > time, Doris will also abort transactions that have not been
> > > > > > committed for a long time)
> > > > > >
> > > > > > ps: This part of the content has also been updated in the FLIP
> > > > > >
> > > > > > - Because the default table model in Doris is Duplicate
> > > > > > (https://doris.apache.org/docs/data-table/data-model/), which
> > > > > > does not have a primary key, batch writing may cause data
> > > > > > duplication, but the Unique model has a primary key, which
> > > > > > ensures the idempotence of writing, thus achieving Exactly-Once
> > > > > >
> > > > > > Brs,
> > > > > > di.wu
> > > > > >
> > > > > > > On March 2, 2024, at 17:50, Jeyhun Karimov [email protected] wrote:
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > Thanks for the proposal. +1 for the FLIP.
> > > > > > > I have a few questions:
> > > > > > >
> > > > > > > - How exactly will the combination of the two (Stream Load's
> > > > > > > two-phase commit and Flink's two-phase commit) ensure the e2e
> > > > > > > exactly-once semantics?
> > > > > > >
> > > > > > > - The FLIP proposes to combine Doris's batch writing with the
> > > > > > > primary key table to achieve Exactly-Once semantics. Could
> > > > > > > you elaborate more on that? Why is it not the default
> > > > > > > behavior but a workaround?
> > > > > > >
> > > > > > > Regards,
> > > > > > > Jeyhun
> > > > > > >
> > > > > > > On Sat, Mar 2, 2024 at 10:14 AM Yanquan Lv [email protected] wrote:
> > > > > > >
> > > > > > > > Thanks for driving this.
> > > > > > > > The content is very detailed, it is recommended to add a
> > > > > > > > section on Test Plan for more completeness.
> > > > > > > >
> > > > > > > > On Thu, Jan 25, 2024 at 15:40, Di Wu [email protected] wrote:
> > > > > > > >
> > > > > > > > > Hi all,
> > > > > > > > >
> > > > > > > > > Previously, we had some discussions about contributing
> > > > > > > > > Flink Doris Connector to the Flink community [1]. I want
> > > > > > > > > to further promote this work. I hope everyone will help
> > > > > > > > > participate in this FLIP discussion and provide more
> > > > > > > > > valuable opinions and suggestions.
> > > > > > > > > Thanks.
> > > > > > > > >
> > > > > > > > > [1]
> > > > > > > > > https://lists.apache.org/thread/lvh8g9o6qj8bt3oh60q81z0o1cv3nn8p
> > > > > > > > >
> > > > > > > > > Brs,
> > > > > > > > > di.wu
> > > > > > > > >
> > > > > > > > > On 2023/12/07 05:02:46 wudi wrote:
> > > > > > > > >
> > > > > > > > > > Hi all,
> > > > > > > > > >
> > > > > > > > > > As discussed in the previous email [1], about
> > > > > > > > > > contributing the Flink Doris Connector to the Flink
> > > > > > > > > > community.
> > > > > > > > > >
> > > > > > > > > > Apache Doris[2] is a high-performance, real-time
> > > > > > > > > > analytical database based on MPP architecture. For
> > > > > > > > > > scenarios where Flink is used for data analysis,
> > > > > > > > > > processing, or real-time writing on Doris, Flink Doris
> > > > > > > > > > Connector is an effective tool.
> > > > > > > > > >
> > > > > > > > > > At the same time, contributing Flink Doris Connector to
> > > > > > > > > > the Flink community will further expand the Flink
> > > > > > > > > > Connectors ecosystem.
> > > > > > > > > >
> > > > > > > > > > So I would like to start an official discussion of
> > > > > > > > > > FLIP-399: Flink Connector Doris[3].
> > > > > > > > > >
> > > > > > > > > > Looking forward to comments, feedback and suggestions
> > > > > > > > > > from the community on the proposal.
> > > > > > > > > >
> > > > > > > > > > [1]
> > > > > > > > > > https://lists.apache.org/thread/lvh8g9o6qj8bt3oh60q81z0o1cv3nn8p
> > > > > > > > > > [2]
> > > > > > > > > > https://doris.apache.org/docs/dev/get-starting/what-is-apache-doris/
> > > > > > > > > > [3]
> > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-399%3A+Flink+Connector+Doris
> > > > > > > > > >
> > > > > > > > > > Brs,
> > > > > > > > > >
> > > > > > > > > > di.wu
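
The exactly-once flow di.wu describes in this thread (precommit when the checkpoint barrier arrives, commit once the checkpoint completes, abort by label prefix on restart) can be sketched roughly as follows. This is only a self-contained in-memory simulation under stated assumptions: `FakeDoris`, `SketchSink`, and every method name here are hypothetical stand-ins, not the connector's classes and not the real Stream Load HTTP API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TwoPhaseCommitSketch {

    /** Hypothetical in-memory stand-in for Doris; NOT the real Stream Load API. */
    static class FakeDoris {
        // label -> rows that are persisted but not yet visible
        private final Map<String, List<String>> preCommitted = new LinkedHashMap<>();
        private final List<String> visible = new ArrayList<>();

        /** Phase 1: persist rows under a label; they stay invisible. The label doubles as the txn id. */
        String preCommit(String label, List<String> rows) {
            preCommitted.put(label, new ArrayList<>(rows));
            return label;
        }

        /** Phase 2: make a pre-committed transaction visible. Unknown ids are ignored. */
        void commit(String txnId) {
            List<String> rows = preCommitted.remove(txnId);
            if (rows != null) {
                visible.addAll(rows);
            }
        }

        /** Abort all pending transactions whose label starts with the prefix; returns how many were aborted. */
        int abortByLabelPrefix(String prefix) {
            int before = preCommitted.size();
            preCommitted.keySet().removeIf(label -> label.startsWith(prefix));
            return before - preCommitted.size();
        }

        List<String> visibleRows() {
            return new ArrayList<>(visible);
        }
    }

    /** Hypothetical sink that follows the three steps described in the thread. */
    static class SketchSink {
        private final FakeDoris doris;
        private final String labelPrefix;
        private final List<String> buffer = new ArrayList<>();

        SketchSink(FakeDoris doris, String labelPrefix) {
            this.doris = doris;
            this.labelPrefix = labelPrefix;
        }

        void write(String row) {
            buffer.add(row);
        }

        /** Step 1: on the checkpoint barrier, precommit buffered rows and hand the txn id to the committer. */
        String onCheckpointBarrier(long checkpointId) {
            String txnId = doris.preCommit(labelPrefix + "_" + checkpointId, buffer);
            buffer.clear();
            return txnId;
        }

        /** Step 2: once the whole checkpoint has completed, commit the txn so its rows become visible. */
        void onCheckpointComplete(String txnId) {
            doris.commit(txnId);
        }

        /** Step 3: on restart, drop local state and abort pending txns that carry this job's label prefix. */
        int onRestart() {
            buffer.clear();
            return doris.abortByLabelPrefix(labelPrefix);
        }
    }

    public static void main(String[] args) {
        FakeDoris doris = new FakeDoris();
        SketchSink sink = new SketchSink(doris, "job1");

        // Checkpoint 1 completes normally.
        sink.write("a");
        sink.write("b");
        sink.onCheckpointComplete(sink.onCheckpointBarrier(1));

        // Checkpoint 2 is precommitted, but the job fails before the commit.
        sink.write("c");
        sink.onCheckpointBarrier(2);
        System.out.println("aborted on restart: " + sink.onRestart());

        // The rows after checkpoint 1 are replayed and go through both phases again.
        sink.write("c");
        sink.onCheckpointComplete(sink.onCheckpointBarrier(2));
        System.out.println("visible rows: " + doris.visibleRows());
    }
}
```

The point of the sketch is that a row only becomes visible after the second phase, so a failure between precommit and commit leaves nothing visible to duplicate when the input is replayed.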

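di.wu's second answer in the thread, that batch writing can duplicate rows in a table without a primary key while a primary-key table makes a replayed write idempotent, can be illustrated with a small sketch. The two classes below are hypothetical in-memory models (an append-only list and a key-to-row map), not Doris table implementations, and the `id`/`value` row shape is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IdempotentWriteSketch {

    /** Hypothetical model of a table without a primary key: every write appends. */
    static class AppendOnlyTable {
        private final List<String> rows = new ArrayList<>();

        void write(int id, String value) {
            rows.add(id + ":" + value);
        }

        int rowCount() {
            return rows.size();
        }
    }

    /** Hypothetical model of a primary-key table: a write with an existing key replaces the row. */
    static class PrimaryKeyTable {
        private final Map<Integer, String> rows = new LinkedHashMap<>();

        void write(int id, String value) {
            rows.put(id, value);
        }

        int rowCount() {
            return rows.size();
        }
    }

    public static void main(String[] args) {
        AppendOnlyTable appendOnly = new AppendOnlyTable();
        PrimaryKeyTable keyed = new PrimaryKeyTable();

        // The same batch is written twice, as happens when a batch is replayed after a failure.
        for (int attempt = 0; attempt < 2; attempt++) {
            appendOnly.write(1, "x");
            appendOnly.write(2, "y");
            keyed.write(1, "x");
            keyed.write(2, "y");
        }

        System.out.println("append-only rows: " + appendOnly.rowCount());
        System.out.println("primary-key rows: " + keyed.rowCount());
    }
}
```

Replaying the batch doubles the rows in the append-only model but leaves the keyed model unchanged, which is the idempotence property the reply relies on.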