Abacn commented on code in PR #32170:
URL: https://github.com/apache/beam/pull/32170#discussion_r1722172933


##########
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaExactlyOnceSink.java:
##########
@@ -524,6 +526,20 @@ void commitTxn(long lastRecordId, Counter numTransactions) throws IOException {
           ProducerSpEL.commitTransaction(producer);
 
           numTransactions.inc();
+          if (!reportedLineage) {
+            Lineage.getSinks()
+                .add(
+                    "kafka",
+                    ImmutableList.of(

Review Comment:
   The `ImmutableList` is a list of segments; a single
`.add(String system, List segments)` call reports one lineage string, not one
entry per segment. I did some refactoring in #32090 to handle special
characters in segments, see
https://github.com/apache/beam/pull/32090#issuecomment-2274138958
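
To illustrate "one call, one string", here is a minimal sketch. It is not Beam's implementation: `SegmentJoinSketch`, `toLineageName`, the `:` and `.` separators, and the backtick quoting rule are all assumptions made for the example. The actual format is whatever #32090 defines.

```java
import java.util.List;
import java.util.stream.Collectors;

public class SegmentJoinSketch {

  // Hypothetical helper: folds a system name and a list of segments into ONE
  // string. Separators and quoting are assumptions, not Beam's actual rules.
  public static String toLineageName(String system, List<String> segments) {
    String joined =
        segments.stream()
            // Quote any segment that contains characters other than
            // letters, digits, or underscores (assumed escaping rule).
            .map(s -> s.matches("[A-Za-z0-9_]+") ? s : "`" + s + "`")
            .collect(Collectors.joining("."));
    return system + ":" + joined;
  }

  public static void main(String[] args) {
    // Two segments go in, a single lineage string comes out.
    System.out.println(toLineageName("kafka", List.of("broker-1:9092", "orders")));
  }
}
```

The point is only that the segment list is an argument to a single report, so passing a two-element list still produces one lineage entry.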
   
   -----
   
   If the question is about reporting lineage multiple times: the
`reportedLineage` flag here is a simple way to avoid reporting the same metric
too often. Because Lineage is a Set, reporting the same value multiple times
does no harm; it just adds some overhead.
   
   Here each ShardWriter instance always writes to the same topic (spec is a
constructor argument, and the topic it writes to is `spec.getTopic()`), so a
boolean flag suffices.
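
A minimal sketch of the pattern described above, under stated assumptions: `SinkRegistry` is a hypothetical stand-in for the set-backed lineage sink, and this `ShardWriter` is a stripped-down illustration, not the class in `KafkaExactlyOnceSink`. It assumes the flag is set right after the first report.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class LineageDedupSketch {

  // Hypothetical stand-in for a set-backed lineage sink.
  public static class SinkRegistry {
    public final Set<String> names = ConcurrentHashMap.newKeySet();
    public int addCalls = 0; // counts how often add() is invoked

    public void add(String name) {
      addCalls++;
      names.add(name); // set semantics: a duplicate add changes nothing
    }
  }

  // Simplified writer: the topic is fixed at construction time, so one
  // boolean is enough to remember that lineage was already reported.
  public static class ShardWriter {
    private final SinkRegistry registry;
    private final String topic;
    private boolean reportedLineage = false;

    public ShardWriter(SinkRegistry registry, String topic) {
      this.registry = registry;
      this.topic = topic;
    }

    public void commitTxn() {
      if (!reportedLineage) {
        registry.add("kafka:" + topic); // name format is illustrative only
        reportedLineage = true;
      }
    }
  }

  public static void main(String[] args) {
    SinkRegistry registry = new SinkRegistry();
    ShardWriter writer = new ShardWriter(registry, "orders");
    for (int i = 0; i < 1000; i++) {
      writer.commitTxn();
    }
    System.out.println(registry.names.size() + " entry, " + registry.addCalls + " add call");
  }
}
```

Without the flag the set would still end up with one entry, but `add` would run on every commit; the flag only saves that repeated work. If a writer could switch topics, a boolean would no longer be enough and a per-topic check would be needed.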



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]