[jira] [Comment Edited] (FLINK-36778) losing data when using rowid as the chunk key column in OracleIncrementalSource.java

shaohui hong (Jira) Wed, 11 Dec 2024 22:53:08 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-36778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17905033#comment-17905033
 ]


shaohui hong edited comment on FLINK-36778 at 12/12/24 6:52 AM:
----------------------------------------------------------------

I can fix it


was (Author: JIRAUSER307825):
I can fixed it

> losing data when using rowid as the chunk key column in 
> OracleIncrementalSource.java
> ------------------------------------------------------------------------------------
>
>                 Key: FLINK-36778
>                 URL: https://issues.apache.org/jira/browse/FLINK-36778
>             Project: Flink
>          Issue Type: Bug
>          Components: Flink CDC
>    Affects Versions: 3.0.0
>            Reporter: shaohui hong
>            Priority: Major
>              Labels: Flink-CDC
>
> If a user does not specify a chunk key column in OracleIncrementalSource, the 
> flink-connector-oracle-cdc connector chooses rowid as the chunk key column. 
> Snapshotting the data works correctly, but the problem appears once 
> processing moves to the stream backfill phase.
> The data of a captured table is split into chunks using the chunk key 
> column. Processing each snapshot chunk takes four steps: determining the low 
> watermark, snapshotting the data, determining the high watermark, and 
> running the stream backfill. All output elements are put into a queue and 
> processed by the pollSplitRecords function defined in 
> IncrementalSourceScanFetcher.java. The queue has the following layout:
> [low watermark event][snapshot events][high watermark event][change 
> events][end watermark event]
> The snapshot data is put into a map named outputBuffer, whose key is taken 
> from the chunk key column and whose value is a record of the captured table. 
> If rowid is used as the chunk key column, the key of outputBuffer is the 
> rowid. In that case, when the stream backfill data is used to rewrite 
> outputBuffer, the key of each change event is either null (when the captured 
> table defines no primary key) or built from the primary key columns of the 
> captured table. Neither matches a rowid, so the key is not found in 
> outputBuffer and the value is never rewritten.
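
The key mismatch described above can be illustrated with a small, self-contained
sketch. It is a simplified model under stated assumptions, not the connector's
actual code: the class BackfillKeyMismatchSketch, its methods, the rowid strings
and the row contents are all hypothetical, and only the name outputBuffer comes
from the description. The buffer is modelled as a plain map from a string key to
a string row.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified model of the key mismatch; hypothetical, not the connector's code. */
final class BackfillKeyMismatchSketch {

    /**
     * Replaces the buffered snapshot row stored under eventKey with newRow.
     * Returns true only if a row with that key was already buffered.
     */
    static boolean rewrite(Map<String, String> outputBuffer, String eventKey, String newRow) {
        if (!outputBuffer.containsKey(eventKey)) {
            return false; // key not found: the snapshot row stays stale
        }
        outputBuffer.put(eventKey, newRow);
        return true;
    }

    /** Snapshot rows keyed by made-up rowid values (the default chunk key case). */
    static Map<String, String> snapshotKeyedByRowid() {
        Map<String, String> buffer = new LinkedHashMap<>();
        buffer.put("AAAR3sAAEAAAACXAAA", "id=1, name=alice");
        buffer.put("AAAR3sAAEAAAACXAAB", "id=2, name=bob");
        return buffer;
    }

    /** The same rows keyed by the primary key value instead. */
    static Map<String, String> snapshotKeyedByPrimaryKey() {
        Map<String, String> buffer = new LinkedHashMap<>();
        buffer.put("1", "id=1, name=alice");
        buffer.put("2", "id=2, name=bob");
        return buffer;
    }

    public static void main(String[] args) {
        String updatedRow = "id=1, name=alice-v2";

        // Buffer keyed by rowid, change event keyed by primary key: no match.
        Map<String, String> byRowid = snapshotKeyedByRowid();
        System.out.println("pk key vs rowid buffer:   " + rewrite(byRowid, "1", updatedRow));

        // Table without a primary key: the change event key is null, no match.
        System.out.println("null key vs rowid buffer: " + rewrite(byRowid, null, updatedRow));

        // Buffer and change event both keyed by primary key: the row is rewritten.
        Map<String, String> byPk = snapshotKeyedByPrimaryKey();
        System.out.println("pk key vs pk buffer:      " + rewrite(byPk, "1", updatedRow));
    }
}
```

In this model the first two calls leave the buffer unchanged, so the snapshot
would emit the stale row for id=1, while the third call applies the update. It
only shows why the snapshot buffer and the backfill change events need to agree
on how keys are built; it does not suggest a specific fix.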



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
