[
https://issues.apache.org/jira/browse/KUDU-3662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18011135#comment-18011135
]
ASF subversion and git services commented on KUDU-3662:
-------------------------------------------------------
Commit 017185b2dff0346a28fe72c5ec9589eddde878bd in kudu's branch
refs/heads/master from Marton Greber
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=017185b2d ]
KUDU-3662 [4/n] Add reader & writer config parsing
Adds configuration parsing for Kudu Source and Kudu Sink so that
replication jobs can be tuned without code changes.
* ReplicationEnvProvider now consumes the parsed configs.
* ReplicationTestBase auto-generates default reader/writer configs.
* Adds two unit tests in replication_config_parser_test to verify
correct deserialization and application of defaults.
Change-Id: I48633a52046c7b5e637786d8e3c72d89946dc3e9
Reviewed-on: http://gerrit.cloudera.org:8080/23121
Tested-by: Marton Greber <[email protected]>
Reviewed-by: Gabriella Lotz <[email protected]>
Reviewed-by: Zoltan Chovan <[email protected]>
Reviewed-by: Ashwani Raina <[email protected]>
Reviewed-by: Alexey Serbin <[email protected]>
Reviewed-by: Zoltan Martonka <[email protected]>
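The reader/writer config parsing the commit describes can be sketched roughly as follows. This is a minimal illustration in plain Java, not the actual replication_config_parser code: the class name, the `reader.`/`writer.` key prefixes, and the default values are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Minimal sketch: split a flat property set into reader and writer configs
// and apply defaults for any key the user did not set. Names and defaults
// here are illustrative only, not the real Kudu replication parser.
class ReplicationConfigSketch {

    // Extract every property starting with `prefix` (prefix stripped),
    // layered on top of the supplied defaults.
    static Map<String, String> parse(Properties props, String prefix,
                                     Map<String, String> defaults) {
        Map<String, String> config = new HashMap<>(defaults);
        for (String name : props.stringPropertyNames()) {
            if (name.startsWith(prefix)) {
                config.put(name.substring(prefix.length()), props.getProperty(name));
            }
        }
        return config;
    }
}
```

A caller would invoke it twice, e.g. `parse(props, "reader.", readerDefaults)` and `parse(props, "writer.", writerDefaults)`, so user-supplied keys override defaults and everything else falls through.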
> Flink based continuous replication
> ----------------------------------
>
> Key: KUDU-3662
> URL: https://issues.apache.org/jira/browse/KUDU-3662
> Project: Kudu
> Issue Type: New Feature
> Reporter: Marton Greber
> Priority: Major
>
> Goal:
> Implement a Flink job that continuously reads from one Kudu cluster and
> writes to a sink Kudu cluster.
> Prerequisites:
> Previously, only a Flink Kudu sink connector existed. With the release
> of flink-connector-kudu 2.0 we developed a source connector that has a
> continuous, unbounded mode and utilises diff scans to read from Kudu.
> (https://github.com/apache/flink-connector-kudu/pull/8)
> The above prerequisite is now available.
> Design:
> A high-level design doc has already been sent out to the mailing list:
> https://docs.google.com/document/d/1oaAn_cOY7aKth0C6MbNXgKU3R-PYols-V4got-_gpDk/edit?usp=sharing
> Development:
> - The Flink-based Kudu replication job would live in the Kudu Java project
> (similar to how the Spark-based backup and restore jobs do).
> - We need to create a Flink job that utilises the Flink Kudu source and sink
> implementations.
> - Provide CLI interface to be able to pipe down all the necessary reader and
> writer configs.
> - Create a table initialiser that can re-create the source table's schema and
> partitioning on the sink cluster, if desired (this is a convenience feature
> to make setup easier).
> - The Flink Kudu source currently lacks metrics. To avoid waiting another
> release cycle, we can create a wrapped source inside our project, contribute
> the metrics back to the flink-connector-kudu repo, and then remove this
> intermediary logic from the job.
> - Write unit and integration tests.
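The wrapped-source idea in the metrics bullet is essentially a decorator: delegate every call to the underlying source and update counters as records pass through. A minimal sketch in plain Java, where the `Source` interface and `MeteredSource` class are stand-ins for illustration and not Flink's actual Source API:

```java
import java.util.Iterator;
import java.util.concurrent.atomic.AtomicLong;

// Stand-in record-source abstraction; not Flink's real Source interface.
interface Source<T> {
    Iterator<T> read();
}

// Decorator: counts records emitted by the wrapped source without touching
// the wrapped implementation. Once equivalent metrics land upstream in
// flink-connector-kudu, a wrapper like this can simply be deleted.
class MeteredSource<T> implements Source<T> {
    private final Source<T> delegate;
    private final AtomicLong recordsOut = new AtomicLong();

    MeteredSource(Source<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public Iterator<T> read() {
        Iterator<T> inner = delegate.read();
        return new Iterator<T>() {
            @Override public boolean hasNext() { return inner.hasNext(); }
            @Override public T next() {
                recordsOut.incrementAndGet(); // the metric to contribute upstream
                return inner.next();
            }
        };
    }

    long recordsOut() { return recordsOut.get(); }
}
```

The job would construct the real connector source, wrap it, and register the counters with Flink's metric group; only the wrapping layer is shown here.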
--
This message was sent by Atlassian Jira
(v8.20.10#820010)