rohitsinha54 commented on code in PR #32170:
URL: https://github.com/apache/beam/pull/32170#discussion_r1722290171


##########
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaUnboundedSource.java:
##########
@@ -74,6 +81,7 @@ public List<KafkaUnboundedSource<K, V>> split(int desiredNumSplits, PipelineOpti
             if (pattern.matcher(entry.getKey()).matches()) {
               for (PartitionInfo p : entry.getValue()) {
                 partitions.add(new TopicPartition(p.topic(), p.partition()));
+                Lineage.getSources().add("kafka", ImmutableList.of(bootStrapServers, p.topic()));

Review Comment:
   For the future: it will be interesting to see how the FQN represents Google 
Managed Kafka versus Apache Kafka. If the two are represented separately, we 
will need to do some follow-up work to handle them correctly. 
   
   



##########
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaWriter.java:
##########
@@ -91,6 +93,19 @@ public void processElement(ProcessContext ctx, MultiOutputReceiver receiver) thr
               callback);
 
       elementsWritten.inc();
+      if (!topicName.equals(reportedLineage)) {

Review Comment:
   Why are we adding this check? Do we expect to see more than one topic name, 
or is it just to avoid re-running this branch? If we expect more than one, are 
they guaranteed to appear together? 
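
To make the concern concrete, here is a minimal standalone sketch, not the PR's code. It assumes the check only remembers the most recently reported topic, and compares that with a hypothetical set-based variant. `LineageDedupSketch`, `reportsWithLastSeen`, and `reportsWithSet` are illustrative names, and the counters stand in for calls to the lineage reporting API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class LineageDedupSketch {
  // Mirrors a "last reported topic" check: a report fires whenever the
  // current topic differs from the one remembered from the previous report.
  static int reportsWithLastSeen(List<String> topics) {
    String reported = null;
    int reports = 0;
    for (String topic : topics) {
      if (!topic.equals(reported)) {
        reports++; // stand-in for the lineage report call
        reported = topic;
      }
    }
    return reports;
  }

  // Hypothetical alternative: remember every topic already reported,
  // so each distinct topic is reported once regardless of ordering.
  static int reportsWithSet(List<String> topics) {
    Set<String> reported = new HashSet<>();
    int reports = 0;
    for (String topic : topics) {
      if (reported.add(topic)) {
        reports++; // stand-in for the lineage report call
      }
    }
    return reports;
  }

  public static void main(String[] args) {
    List<String> grouped = Arrays.asList("a", "a", "b", "b");
    List<String> interleaved = Arrays.asList("a", "b", "a", "b");
    System.out.println("grouped, last-seen: " + reportsWithLastSeen(grouped));
    System.out.println("interleaved, last-seen: " + reportsWithLastSeen(interleaved));
    System.out.println("interleaved, set: " + reportsWithSet(interleaved));
  }
}
```

If topics arrive grouped, both variants report each topic once. If they interleave, the last-seen variant reports on every switch, which may or may not matter depending on how cheap and idempotent the report is.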



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
