Hi Tejansh,

Thanks for your work.

I have a few thoughts on the proposed change:

First, since the Flink SQL/Table API relies on RowData, it cannot handle
transactional metadata. Therefore, this feature can only be supported
in the DataStream API.

Second, because parsing this transactional information may introduce
performance overhead, it would be best to add an option that controls
whether the parsing is enabled, and it should be disabled by default.

Third, as I understand it, the current proposal only includes changes to
the CDC source connector and does not cover the MySQL pipeline connector.
If we decide to support this feature in the MySQL pipeline connector in
the future, we can discuss that separately in another JIRA ticket.
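
As a rough illustration of the second point, the opt-in behaviour could
look like the sketch below. It is written in Python only for brevity and
is not connector code. The option name `parse_transaction_metadata` is
hypothetical, and the event shape (a `status` field of BEGIN or END on
boundary events) is assumed from Debezium's transaction metadata format
when `provide.transaction.metadata` is enabled.

```python
from typing import Iterable, Iterator, List

# Hypothetical option default: transaction parsing is off unless requested.
PARSE_TRANSACTION_METADATA_DEFAULT = False


def is_boundary(event: dict) -> bool:
    """Assumed shape: boundary events carry a status of BEGIN or END."""
    return event.get("status") in ("BEGIN", "END")


def emit(
    events: Iterable[dict],
    parse_transaction_metadata: bool = PARSE_TRANSACTION_METADATA_DEFAULT,
) -> Iterator[List[dict]]:
    """Yield batches of change events.

    Disabled (default): boundary events are skipped and each change event
    is yielded on its own, without any buffering.
    Enabled: change events between BEGIN and END are yielded as one batch.
    """
    buffer: List[dict] = []
    in_txn = False
    for event in events:
        if not parse_transaction_metadata:
            if not is_boundary(event):
                yield [event]
            continue
        status = event.get("status")
        if status == "BEGIN":
            in_txn, buffer = True, []
        elif status == "END":
            if buffer:
                yield buffer
            in_txn, buffer = False, []
        elif in_txn:
            buffer.append(event)
        else:
            # Change seen outside any transaction, e.g. snapshot rows.
            yield [event]
    # A transaction still open at the end of the input is not emitted here;
    # a real source would have to keep it in checkpointed state.
```

With the flag left at its default, the output matches what the connector
does today (boundaries dropped); with it enabled, each yielded batch
corresponds to one source transaction.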

Best,
Hang

On Tue, Nov 18, 2025 at 5:13 PM Tejansh Rana
<[email protected]> wrote:
>
> Hello,
>
> Bumping this thread.
> I have created a Jira ticket describing the proposal. Based on Gunnar’s
> feedback, it also includes the base Source Emitter, which would cover this
> feature for connectors like Postgres:
> https://issues.apache.org/jira/browse/FLINK-38691
> The draft PR with the proposed changes for the MySQL connector is here:
> https://github.com/apache/flink-cdc/pull/4170
>
> I would appreciate some feedback on this proposal and I would be happy to 
> contribute the feature.
>
> Thank you,
> Tejansh
>
> From: Tejansh Rana <[email protected]>
> Date: Tuesday, 4 November 2025 at 16:50
> To: [email protected] <[email protected]>
> Subject: Re: [PROPOSAL] Support for MySQL Transaction Boundary Events in 
> Flink CDC Connector
>
> Thank you, Gunnar!
> I have created a draft PR for the proposed feature:
> https://github.com/apache/flink-cdc/pull/4170
> Looking forward to hearing more feedback.
>
> Thank you,
> Tejansh
>
> From: Gunnar Morling <[email protected]>
> Date: Monday, 3 November 2025 at 17:19
> To: [email protected] <[email protected]>
> Subject: Re: [PROPOSAL] Support for MySQL Transaction Boundary Events in 
> Flink CDC Connector
>
> Hey all,
>
> I'd love to see support for this! Coincidentally, I am just working on a
> PoC right now which uses the custom watermarks in the DataStream v2 API to
> represent transaction boundaries. It seems this is a great fit
> conceptually. In any case, it would be nice to not only support this for
> MySQL but also other DBs. Debezium provides that transaction metadata for a
> range of connectors, including Postgres.
>
> --Gunnar
>
>
> On Mon, 3 Nov 2025 at 15:53, Tejansh Rana <[email protected]>
> wrote:
>
> > Hello,
> >
> > Following up on the proposal below. I would appreciate your thoughts on
> > whether we can move forward with this feature.
> >
> > Thank you,
> > Tejansh
> >
> > From: Tejansh Rana <[email protected]>
> > Date: Friday, 17 October 2025 at 15:58
> > To: [email protected] <[email protected]>
> > Subject: [PROPOSAL] Support for MySQL Transaction Boundary Events in Flink
> > CDC Connector
> >
> > Hi team,
> >
> > Following my discussion with Leonard Xu at Flink Forward, I am writing to
> > propose a feature enhancement for the Flink MySQL CDC connector related to
> > how it handles transaction metadata from the MySQL binary log.
> >
> > Problem Statement:
> > In data streaming pipelines that require transactional guarantees or need
> > to group atomic changes together, it is essential to identify the
> > boundaries of the original database transaction (i.e., the BEGIN and COMMIT
> > or END events). Currently, the Flink MySQL CDC connector appears to skip
> > these transaction lifecycle events:
> > https://github.com/apache/flink-cdc/blob/23a1c2efb6fa9ce1c9f17b3836f6aaa995bb0660/flink-cdc-connect/flink-cdc-source-connectors/flink-connector-mysql-cdc/src/main/java/org/apache/flink/cdc/connectors/mysql/source/reader/MySqlRecordEmitter.java#L77
> > I have also attached a screenshot of the logs showing this behaviour.
> >
> > This omission makes it challenging to reconstruct the original transaction
> > scope. Without explicit transaction markers, downstream Flink jobs cannot
> > easily guarantee atomicity across sinks.
> >
> > Proposed Solution:
> > The underlying CDC mechanism, Debezium, supports emitting transaction
> > boundary events (BEGIN and END/COMMIT) through its configuration.
> >
> > We propose enhancing the Flink MySQL CDC connector to expose this
> > transaction metadata to the Flink pipeline. The connector should emit
> > specialised records or metadata fields that indicate the start and end of
> > a transaction, as reported by Debezium. We would be happy to create a PR
> > with this feature if this proposal goes ahead.
> >
> > Thank you,
> > Tejansh
> >
> >
> >
