julianbradford19-png commented on issue #10641:
URL: https://github.com/apache/seatunnel/issues/10641#issuecomment-4151073241

   Thanks for the feedback! I agree on all points. For this PR, I’ll keep the
   MySQL-specific field names for accuracy, add a note about COALESCE usage
   for snapshots in the docs, and ensure both unit and E2E tests cover the
   null fallback logic. I’ll proceed with opening the PR.
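
For the COALESCE note, a rough sketch of the idea might look like the following. It uses Python's built-in sqlite3 module only as a stand-in for a downstream engine, and the table and column names (changes, binlog_file, binlog_pos) are hypothetical placeholders, not the final field names from the PR. Snapshot rows carry NULL positions, so COALESCE maps them to values that sort before any real binlog coordinate.

```python
import sqlite3

# Hypothetical sample data: (id, value, binlog_file, binlog_pos).
# The column names are placeholders, not the names chosen in the PR.
ROWS = [
    (1, "update-2", "mysql-bin.000002", 120),
    (1, "snapshot", None, None),  # snapshot phase: no binlog position
    (1, "update-1", "mysql-bin.000001", 950),
]


def ordered_values(rows):
    """Return values ordered so snapshot rows come before incremental ones."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE changes ("
        "id INTEGER, value TEXT, binlog_file TEXT, binlog_pos INTEGER)"
    )
    conn.executemany("INSERT INTO changes VALUES (?, ?, ?, ?)", rows)
    # COALESCE replaces NULL positions with sentinels ('' and -1) that
    # sort ahead of every real binlog file name and offset.
    query = (
        "SELECT value FROM changes "
        "ORDER BY COALESCE(binlog_file, ''), COALESCE(binlog_pos, -1)"
    )
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result


if __name__ == "__main__":
    print(ordered_values(ROWS))
```

This assumes binlog file names compare correctly as strings, which only holds while the numeric suffix keeps a fixed width; a real downstream query may need to parse the suffix into a number instead.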
   
   On Sat, Mar 28, 2026, 10:13 AM David Zollo ***@***.***> wrote:
   
   > *davidzollo* left a comment (apache/seatunnel#10641)
   > <https://github.com/apache/seatunnel/issues/10641#issuecomment-4148240742>
   >
   > Hi @ricky2129 <https://github.com/ricky2129>, thanks for the incredibly
   > detailed design doc. This is indeed a critical pain point for the
   > Enterprise CDC-to-Data-Lake pipeline, and relying on strict transaction log
   > coordinates instead of EventTime is absolutely the right way to go.
   >
   > Overall, your design looks very elegant and safe:
   >
   >    1. Putting these extra descriptors in SeaTunnelRow.options without
   >    interfering with the core layout and serialization logic is correct and
   >    preserves checkpoint safety.
   >    2. Generating null gracefully for non-MySQL sources and keeping
   >    backward compatibility is fully acknowledged.
   >
   > A few points/suggestions before we proceed with the PR:
   >
   >    1. *About Field Naming Generalization:* Since different databases have
   >    different position terminologies (e.g., LSN for PostgreSQL, SCN for
   >    Oracle), maybe we could think abstractly whether we should use
   >    generalized names like LogFile, LogPos, or LogSequence if we want to
   >    make the
   >    Metadata transform generic across JDBC sources down the road. If you
   >    prefer sticking with explicit MySQL BinlogXX terminologies for this PR
   >    due to exactness, that is perfectly fine with me. We can document it
   >    specifically.
   >    2. *Snapshot Phase Null Handling:* As you mentioned, startup.mode for
   >    historical snapshots will produce null for these binlog positions.
   >    Just make sure we can provide a small best-practice note in the config
   >    documentation on how users should write their COALESCE query in
   >    downstream engines to correctly sequence snapshot data vs incremental
   >    binlog data.
   >    3. *E2E Testing:* Please ensure that we cover this new metadata
   >    extraction via both Unit Tests and a test case in our E2E module
   >    verifying both the value population and the null fallback logic.
   >
   > The design doc looks great to me. Feel free to assign this to yourself and
   > open a PR!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

