Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/24139 )
Change subject: docs: add Flink replication guide
......................................................................


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/24139/1/docs/kudu_flink_replication.adoc
File docs/kudu_flink_replication.adoc:

http://gerrit.cloudera.org:8080/#/c/24139/1/docs/kudu_flink_replication.adoc@49
PS1, Line 49: . All writes to the sink use upsert-ignore semantics, making replication idempotent and
            : compatible with at-least-once Flink checkpointing.
IIUC, the replication job doesn't handle schema drift in the replicated table. Consider adding a paragraph that summarizes the existing/known limitations right away and refers to the corresponding sections (e.g., 'Schema Change Handling') for how to handle each. Adding a separate 'Limitations' paragraph is one way of tracking those -- we do have at least a few doc entries following such a pattern.


http://gerrit.cloudera.org:8080/#/c/24139/1/docs/kudu_flink_replication.adoc@67
PS1, Line 67: The replication job JAR (e.g. `kudu-replication-<version>.jar`).
Is it a fat JAR without extra dependencies, or are some extra Kudu JARs required in the Java classpath as well?


http://gerrit.cloudera.org:8080/#/c/24139/1/docs/kudu_flink_replication.adoc@153
PS1, Line 153: | All other users | Sink tables | _(no insert/update/delete)_
             : | Explicitly deny or omit write privileges
What about permissions to perform various DDL operations on the target table? Does the current set of permissions prevent users from modifying the table's schema?


http://gerrit.cloudera.org:8080/#/c/24139/1/docs/kudu_flink_replication.adoc@391
PS1, Line 391: |===
What about VARCHAR columns?


http://gerrit.cloudera.org:8080/#/c/24139/1/docs/kudu_flink_replication.adoc@393
PS1, Line 393: The `ARRAY` column type is *not supported*. Tables containing array columns cannot be
             : replicated.
Do we consider this a temporary limitation?
If so, maybe mention that to make clear it's not a design deficiency but rather a limitation of the current implementation, and that the status of array column type support may change in the future.


http://gerrit.cloudera.org:8080/#/c/24139/1/docs/kudu_flink_replication.adoc@420
PS1, Line 420: == Monitoring
This seems to be a substantial chunk of information specific to monitoring and metrics. Maybe move it to come before 'Troubleshooting', or make it the very last chapter of this doc?


http://gerrit.cloudera.org:8080/#/c/24139/1/docs/kudu_flink_replication.adoc@422
PS1, Line 422: A working reference monitoring stack (Prometheus + Grafana + json_exporter) is included in
             : `examples/flink-replication/monitoring/`
Please also mention which project's repo this refers to.


http://gerrit.cloudera.org:8080/#/c/24139/1/docs/kudu_flink_replication.adoc@766
PS1, Line 766: 
It would be nice to provide at least minimal guidance on handling table renaming. What happens when the original table is renamed: should users expect errors or any unexpected behavior? Is there a risk of unexpected data loss at the destination cluster (e.g., if the new source table is empty)?



-- 
To view, visit http://gerrit.cloudera.org:8080/24139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I684d608165af636bd4a799351926b68322469218
Gerrit-Change-Number: 24139
Gerrit-PatchSet: 1
Gerrit-Owner: Marton Greber <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Attila Bukor <[email protected]>
Gerrit-Reviewer: Gabriella Lotz <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Chovan <[email protected]>
Gerrit-Reviewer: Zoltan Martonka <[email protected]>
Gerrit-Comment-Date: Fri, 27 Mar 2026 04:04:10 +0000
Gerrit-HasComments: Yes
