Marton Greber created KUDU-3662:
-----------------------------------

             Summary: Flink based continuous replication
                 Key: KUDU-3662
                 URL: https://issues.apache.org/jira/browse/KUDU-3662
             Project: Kudu
          Issue Type: New Feature
            Reporter: Marton Greber


Goal:
Implement a Flink job that continuously reads from one Kudu cluster and writes 
to a sink Kudu cluster. 
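At a high level the job is a single source-to-sink pipeline. A sketch of the wiring, assuming builder-style KuduSource and KuduSink classes from flink-connector-kudu 2.0 (the builder details are assumptions here, not the settled API):

```java
// Sketch only: how KuduSource/KuduSink instances are built (and whether the
// sink uses sinkTo or addSink) must be checked against flink-connector-kudu 2.0.
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

KuduSource<Row> source = /* built from the reader configs: source master
                            addresses, table name, continuous/unbounded mode */ null;
KuduSink<Row> sink = /* built from the writer configs: sink master
                        addresses, target table */ null;

env.fromSource(source, WatermarkStrategy.noWatermarks(), "kudu-source")
   .sinkTo(sink);
env.execute("kudu-replication");
```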

Prerequisites:
Previously there existed only a Flink Kudu sink connector. With the release of 
flink-connector-kudu 2.0 we developed a source connector that has a continuous, 
unbounded mode that utilises diff scans to read from Kudu. 
(https://github.com/apache/flink-connector-kudu/pull/8)

The above prerequisite is now available.
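For background on what the source relies on: a diff scan reports the rows that changed between two snapshot timestamps, which is what lets the connector read continuously instead of re-scanning whole tables. A toy model of that contract, using plain Java maps as stand-ins for table snapshots (this is an illustration, not the Kudu API):

```java
import java.util.HashMap;
import java.util.Map;

public class DiffScanToy {
    // Given two snapshots of a table (key -> value), compute the changes a
    // diff scan between the two timestamps would report: a null value marks
    // a deletion, any other value an insert or update.
    static Map<String, String> diff(Map<String, String> older, Map<String, String> newer) {
        Map<String, String> delta = new HashMap<>();
        for (Map.Entry<String, String> e : newer.entrySet()) {
            if (!e.getValue().equals(older.get(e.getKey()))) {
                delta.put(e.getKey(), e.getValue()); // inserted or updated
            }
        }
        for (String key : older.keySet()) {
            if (!newer.containsKey(key)) {
                delta.put(key, null); // deleted since the older snapshot
            }
        }
        return delta;
    }

    public static void main(String[] args) {
        Map<String, String> t1 = new HashMap<>();
        t1.put("a", "1");
        t1.put("b", "2");
        t1.put("d", "4");
        Map<String, String> t2 = new HashMap<>();
        t2.put("a", "1");  // unchanged
        t2.put("b", "20"); // updated
        t2.put("c", "3");  // inserted; "d" was deleted
        System.out.println(diff(t1, t2)); // {b=20, c=3, d=null}
    }
}
```

The replication job itself never computes this delta; the source connector gets it from Kudu and the job only forwards the resulting change stream to the sink cluster.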

Design:
A high-level design doc has already been sent out to the mailing list:
https://docs.google.com/document/d/1oaAn_cOY7aKth0C6MbNXgKU3R-PYols-V4got-_gpDk/edit?usp=sharing

Development:
- The Flink-based Kudu replication job would live in the Kudu Java project 
(similar to how the backup and restore Spark jobs do). 
- We need to create a Flink job that utilises the Flink Kudu source and sink 
implementations.
- Provide a CLI interface to pass through all the necessary reader and 
writer configs. 
- Create a table initialiser that can re-create the source table's schema and 
partitioning schema on the sink cluster, if desired. (This is a convenience 
feature to make setup easier.)
- The Flink Kudu source currently lacks metrics. To avoid waiting another 
release cycle, we can create a wrapped source inside our project. We will 
contribute those metrics back to the flink-connector-kudu repo, and then 
remove this intermediary logic from the job.
- Write unit and integration tests.
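For the CLI surface above, one simple option is prefix-based flags that are split into reader and writer config maps and forwarded to the source and sink builders. A sketch (the `--reader.`/`--writer.` flag names are hypothetical, not a settled interface):

```java
import java.util.HashMap;
import java.util.Map;

public class ReplicationJobArgs {
    // Split "--reader.x=1 --writer.y=2" style arguments into two config maps,
    // one for the source builder and one for the sink builder.
    // The "--reader."/"--writer." prefixes are hypothetical, not a settled interface.
    final Map<String, String> readerConfig = new HashMap<>();
    final Map<String, String> writerConfig = new HashMap<>();

    static ReplicationJobArgs parse(String[] args) {
        ReplicationJobArgs parsed = new ReplicationJobArgs();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException(
                    "expected --<scope>.<key>=<value>, got: " + arg);
            }
            String key = arg.substring(0, eq);
            String value = arg.substring(eq + 1);
            if (key.startsWith("--reader.")) {
                parsed.readerConfig.put(key.substring("--reader.".length()), value);
            } else if (key.startsWith("--writer.")) {
                parsed.writerConfig.put(key.substring("--writer.".length()), value);
            } else {
                throw new IllegalArgumentException("unknown flag: " + key);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        ReplicationJobArgs parsed = parse(new String[] {
            "--reader.masterAddresses=source-master:7051",
            "--reader.tableName=metrics",
            "--writer.masterAddresses=sink-master:7051",
        });
        System.out.println(parsed.readerConfig.get("tableName")); // metrics
    }
}
```

Keeping the two scopes separate means new connector options need no CLI changes: any key after the prefix is handed to the corresponding builder as-is.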



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
