Hello,
Thanks for the proposal, +1 for the FLIP.

Best Regards
Ahmed Hamdy


On Mon, 11 Mar 2024 at 15:12, wudi <676366...@qq.com.invalid> wrote:

> Hi, Leonard
> Thank you for your suggestion.
> I referred to other connectors[1], modified the naming and types of the
> relevant parameters[2], and also updated the FLIP.
>
> [1]
> https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/table/overview/
> [2]
> https://github.com/apache/doris-flink-connector/blob/master/flink-doris-connector/src/main/java/org/apache/doris/flink/table/DorisConfigOptions.java
>
> Brs,
> di.wu
>
> > On Mar 7, 2024, at 14:33, Leonard Xu <xbjt...@gmail.com> wrote:
> >
> > Thanks wudi for the update. The FLIP generally looks good to me; I have
> > only two minor suggestions:
> >
> > (1) The `.s` suffix in the config option `doris.request.query.timeout.s`
> > looks strange to me. Could we change the value type of all time-interval
> > options to Duration?
> >
> > (2) Could you check and improve all config options, such as
> > `doris.exec.mem.limit`, so that they follow Flink's config option naming
> > and value type conventions?
> >
> > Best,
> > Leonard
> >
> >
> >>
> >>
> >>> On Mar 6, 2024, at 06:12, Jing Ge <j...@ververica.com.INVALID> wrote:
> >>>
> >>> Hi Di,
> >>>
> >>> Thanks for your proposal. +1 for the contribution. I'd like to know
> your
> >>> thoughts about the following questions:
> >>>
> >>> 1. According to your clarification of the exactly-once, thanks for it
> BTW,
> >>> no PreCommitTopology is required. Does it make sense to let
> DorisSink[1]
> >>> implement SupportsCommitter, since the TwoPhaseCommittingSink is
> >>> deprecated[2] before turning the Doris connector into a Flink
> connector?
> >>> 2. OLAP engines are commonly used as the tail/downstream of a data
> >>> pipeline to support, e.g., ad-hoc queries or cubes with feasible
> >>> pre-aggregation. Just out of curiosity, would you like to share some
> >>> real use cases that use OLAP engines as the source of a streaming data
> >>> pipeline? Or will it only be used as a source for batch?
> >>> 3. The E2E test only covered sink[3], if I am not mistaken. Would you
> like
> >>> to test the source in E2E too?
> >>>
> >>> [1]
> >>>
> https://github.com/apache/doris-flink-connector/blob/43e0e5cf9b832854ea228fb093077872e3a311b6/flink-doris-connector/src/main/java/org/apache/doris/flink/sink/DorisSink.java#L55
> >>> [2]
> >>>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-372%3A+Enhance+and+synchronize+Sink+API+to+match+the+Source+API
> >>> [3]
> >>>
> https://github.com/apache/doris-flink-connector/blob/43e0e5cf9b832854ea228fb093077872e3a311b6/flink-doris-connector/src/test/java/org/apache/doris/flink/tools/cdc/MySQLDorisE2ECase.java#L96
> >>>
> >>> Best regards,
> >>> Jing
> >>>
> >>> On Tue, Mar 5, 2024 at 11:18 AM wudi <676366...@qq.com.invalid> wrote:
> >>>
> >>>> Hi, Jeyhun Karimov.
> >>>> Thanks for your question.
> >>>>
> >>>> - How to ensure Exactly-Once?
> >>>> 1. When the Checkpoint Barrier arrives, DorisSink triggers the
> >>>> precommit API of StreamLoad to persist the data in Doris (the data is
> >>>> not yet visible at this point), and passes the TxnID to the Committer.
> >>>> 2. When the Checkpoint of the entire Job completes, the Committer calls
> >>>> the commit API of StreamLoad with the TxnID, making the transaction
> >>>> visible.
> >>>> 3. When the task restarts, any Txn whose precommit succeeded but whose
> >>>> commit failed is aborted, based on the label-prefix, by calling Doris'
> >>>> abort API. (Doris also aborts transactions that have stayed uncommitted
> >>>> for a long time.)
> >>>>
> >>>> PS: This part has also been added to the FLIP.
> >>>>
> >>>> - The default table model in Doris is Duplicate (
> >>>> https://doris.apache.org/docs/data-table/data-model/), which has no
> >>>> primary key, so batch writing may cause data duplication. The UNIQ
> >>>> model has a primary key, which makes writes idempotent and thus
> >>>> achieves Exactly-Once.
> >>>>
> >>>> Brs,
> >>>> di.wu
> >>>>
> >>>>
> >>>>> On Mar 2, 2024, at 17:50, Jeyhun Karimov <je.kari...@gmail.com> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> Thanks for the proposal. +1 for the FLIP.
> >>>>> I have a few questions:
> >>>>>
> >>>>> - How exactly will the combination of the two (Stream Load's two-phase
> >>>>> commit and Flink's two-phase commit) ensure e2e exactly-once semantics?
> >>>>>
> >>>>> - The FLIP proposes to combine Doris's batch writing with the primary
> >>>>> key table to achieve Exactly-Once semantics. Could you elaborate more
> >>>>> on that? Why is it a workaround rather than the default behavior?
> >>>>>
> >>>>> Regards,
> >>>>> Jeyhun
> >>>>>
> >>>>> On Sat, Mar 2, 2024 at 10:14 AM Yanquan Lv <decq12y...@gmail.com>
> wrote:
> >>>>>
> >>>>>> Thanks for driving this.
> >>>>>> The content is very detailed. I recommend adding a Test Plan section
> >>>>>> for completeness.
> >>>>>>
> >>>>>> On Thu, Jan 25, 2024 at 15:40, Di Wu <d...@apache.org> wrote:
> >>>>>>
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> Previously, we had some discussions about contributing the Flink
> >>>>>>> Doris Connector to the Flink community [1]. I would like to move this
> >>>>>>> work forward. I hope everyone will take part in this FLIP discussion
> >>>>>>> and share their opinions and suggestions.
> >>>>>>> Thanks.
> >>>>>>>
> >>>>>>> [1]
> https://lists.apache.org/thread/lvh8g9o6qj8bt3oh60q81z0o1cv3nn8p
> >>>>>>>
> >>>>>>> Brs,
> >>>>>>> di.wu
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 2023/12/07 05:02:46 wudi wrote:
> >>>>>>>>
> >>>>>>>> Hi all,
> >>>>>>>>
> >>>>>>>> This follows up on the previous email [1] about contributing the
> >>>>>>>> Flink Doris Connector to the Flink community.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Apache Doris[2] is a high-performance, real-time analytical database
> >>>>>>>> based on an MPP architecture. For scenarios where Flink is used for
> >>>>>>>> data analysis, processing, or real-time writing to Doris, the Flink
> >>>>>>>> Doris Connector is an effective tool.
> >>>>>>>>
> >>>>>>>> At the same time, contributing the Flink Doris Connector to the Flink
> >>>>>>>> community will further expand the Flink Connectors ecosystem.
> >>>>>>>>
> >>>>>>>> So I would like to start an official discussion on FLIP-399: Flink
> >>>>>>>> Connector Doris[3].
> >>>>>>>>
> >>>>>>>> Looking forward to comments, feedback, and suggestions from the
> >>>>>>>> community on the proposal.
> >>>>>>>>
> >>>>>>>> [1]
> https://lists.apache.org/thread/lvh8g9o6qj8bt3oh60q81z0o1cv3nn8p
> >>>>>>>> [2]
> >>>>>>
> https://doris.apache.org/docs/dev/get-starting/what-is-apache-doris/
> >>>>>>>> [3]
> >>>>>>>
> >>>>>>
> >>>>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-399%3A+Flink+Connector+Doris
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Brs,
> >>>>>>>>
> >>>>>>>> di.wu
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>>
> >>
> >
> >
>
>
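
For readers following the thread, the three exactly-once steps in di.wu's reply of Mar 5 (quoted above) can be sketched as a toy in-memory simulation. This is only an illustration under stated assumptions: `FakeDoris`, `TxnState`, and the label strings are hypothetical stand-ins, and nothing here is the real Doris Stream Load API or the Flink Sink API.

```java
import java.util.HashMap;
import java.util.Map;

public class TwoPhaseCommitSketch {

    // Hypothetical transaction states, named for illustration only.
    enum TxnState { PRECOMMITTED, VISIBLE, ABORTED }

    /** Hypothetical in-memory stand-in for the Doris side of Stream Load. */
    static class FakeDoris {
        private final Map<Long, TxnState> states = new HashMap<>();
        private final Map<Long, String> labels = new HashMap<>();
        private long nextTxnId = 1;

        /** Step 1: data is persisted but not yet visible; returns the TxnID. */
        long precommit(String label) {
            long txnId = nextTxnId++;
            states.put(txnId, TxnState.PRECOMMITTED);
            labels.put(txnId, label);
            return txnId;
        }

        /** Step 2: make a precommitted transaction visible. */
        void commit(long txnId) {
            states.replace(txnId, TxnState.PRECOMMITTED, TxnState.VISIBLE);
        }

        /** Step 3: abort every still-precommitted txn whose label has the prefix. */
        int abortByLabelPrefix(String prefix) {
            int aborted = 0;
            for (Map.Entry<Long, TxnState> e : states.entrySet()) {
                boolean pending = e.getValue() == TxnState.PRECOMMITTED;
                if (pending && labels.get(e.getKey()).startsWith(prefix)) {
                    e.setValue(TxnState.ABORTED);
                    aborted++;
                }
            }
            return aborted;
        }

        TxnState state(long txnId) {
            return states.get(txnId);
        }
    }

    public static void main(String[] args) {
        FakeDoris doris = new FakeDoris();

        // Checkpoint 1: barrier arrives (precommit), checkpoint completes (commit).
        long txn1 = doris.precommit("job-a_0_1");
        doris.commit(txn1);

        // Checkpoint 2: precommit succeeds, then the job fails before commit.
        long txn2 = doris.precommit("job-a_0_2");

        // Restart: clean up dangling transactions by label prefix.
        int aborted = doris.abortByLabelPrefix("job-a");

        System.out.println(doris.state(txn1) + " " + doris.state(txn2) + " aborted=" + aborted);
    }
}
```

The committed transaction stays visible after the restart, while the one that was only precommitted is aborted, so its data never becomes visible and can be replayed from the last completed checkpoint.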

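Leonard's suggestion (1), quoted above, is to move the time unit out of the option key and into a Duration value. A minimal sketch of that idea, using only `java.time.Duration`, might look like the following. The renamed key `doris.request.query.timeout`, the helper `queryTimeout`, and the fallback to the legacy key are assumptions made for illustration, not what the FLIP or the connector defines. The sketch parses ISO-8601 strings such as `PT2M`; Flink's own `ConfigOptions` builder has a `durationType()` for typed options, which this sketch does not use so that it stays dependency-free.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

public class DurationOptionSketch {

    // Legacy key from the thread: the unit (seconds) is encoded in the key name.
    static final String LEGACY_KEY = "doris.request.query.timeout.s";

    // Hypothetical renamed key: the unit is carried by the value instead.
    static final String NEW_KEY = "doris.request.query.timeout";

    /**
     * Resolves the query timeout. Prefers the new key (ISO-8601 duration),
     * falls back to the legacy key (plain seconds), then to the given default.
     */
    static Duration queryTimeout(Map<String, String> conf, Duration fallback) {
        String value = conf.get(NEW_KEY);
        if (value != null) {
            return Duration.parse(value);
        }
        String legacySeconds = conf.get(LEGACY_KEY);
        if (legacySeconds != null) {
            return Duration.ofSeconds(Long.parseLong(legacySeconds));
        }
        return fallback;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();

        conf.put(LEGACY_KEY, "90");
        System.out.println(queryTimeout(conf, Duration.ofMinutes(1))); // prints PT1M30S

        conf.put(NEW_KEY, "PT2M");
        System.out.println(queryTimeout(conf, Duration.ofMinutes(1))); // prints PT2M
    }
}
```

With a Duration-typed value, callers no longer need to know which unit a given key expects, which is the naming concern raised in the review.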