Abacn commented on code in PR #32170:
URL: https://github.com/apache/beam/pull/32170#discussion_r1722172933
##########
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaExactlyOnceSink.java:
##########
@@ -524,6 +526,20 @@ void commitTxn(long lastRecordId, Counter numTransactions)
throws IOException {
ProducerSpEL.commitTransaction(producer);
numTransactions.inc();
+ if (!reportedLineage) {
+ Lineage.getSinks()
+ .add(
+ "kafka",
+ ImmutableList.of(
Review Comment:
The "ImmutableList" is a list of segments; each `.add(String system, List
segments)` call reports a single lineage string. I did some refactoring in
#32090 to handle special characters in segments, see
https://github.com/apache/beam/pull/32090#issuecomment-2274138958
-----
If the question is about reporting lineage multiple times: the
reportedLineage flag here is a simple way to avoid reporting the same metric
too often. Because Lineage is a Set, reporting multiple times does no harm; it
just adds some overhead.
Here each ShardWriter instance always writes to the same topic (spec is a
constructor argument, and the topic it writes to is `spec.getTopic()`), so a
boolean flag suffices.
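
To illustrate the reasoning above, here is a minimal sketch that does not use Beam at all. `FakeLineage` and `FakeShardWriter` are hypothetical stand-ins (not the real `Lineage` or `ShardWriter` classes), and both the `system:seg.seg` string format and the `example-cluster` segment are assumptions made only for this example. It shows that one `add` call with several segments yields one entry, that a Set absorbs repeated reports, and that a per-instance boolean skips the redundant calls.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a set-backed lineage sink; NOT Beam's Lineage class. */
class FakeLineage {
  private final Set<String> entries = new LinkedHashSet<>();
  int addCalls = 0; // counts calls, to make the avoided overhead visible

  // All segments passed in one call are joined into a single entry.
  // The "system:seg.seg" format is an assumption for illustration only.
  void add(String system, List<String> segments) {
    addCalls++;
    entries.add(system + ":" + String.join(".", segments));
  }

  Set<String> entries() {
    return entries;
  }
}

/** Hypothetical writer whose topic is fixed at construction time. */
class FakeShardWriter {
  private final String topic;
  private final FakeLineage lineage;
  private boolean reportedLineage = false;

  FakeShardWriter(String topic, FakeLineage lineage) {
    this.topic = topic;
    this.lineage = lineage;
  }

  void commitTxn() {
    // The topic never changes for this instance, so one report is enough.
    if (!reportedLineage) {
      // "example-cluster" is a made-up segment, not taken from the PR.
      lineage.add("kafka", Arrays.asList("example-cluster", topic));
      reportedLineage = true;
    }
  }
}

public class LineageDedupSketch {
  public static void main(String[] args) {
    FakeLineage lineage = new FakeLineage();
    FakeShardWriter writer = new FakeShardWriter("orders", lineage);
    for (int i = 0; i < 3; i++) {
      writer.commitTxn();
    }
    System.out.println("entries=" + lineage.entries().size()
        + " addCalls=" + lineage.addCalls);
  }
}
```

If a writer could switch topics during its lifetime, a single boolean would no longer be enough and something like a set of already-reported topics would be needed; that is not the case described here.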