guozhangwang commented on code in PR #12555:
URL: https://github.com/apache/kafka/pull/12555#discussion_r959844687


##########
streams/src/main/java/org/apache/kafka/streams/kstream/internals/InternalStreamsBuilder.java:
##########
@@ -270,16 +272,20 @@ private void maybeAddNodeForOptimizationMetadata(final GraphNode node) {
 
     // use this method for testing only
     public void buildAndOptimizeTopology() {
-        buildAndOptimizeTopology(false);
+        buildAndOptimizeTopology(false, false);
     }
 
-    public void buildAndOptimizeTopology(final boolean optimizeTopology) {
+    public void buildAndOptimizeTopology(
+        final boolean optimizeTopology, final boolean optimizeSelfJoin) {
 
         mergeDuplicateSourceNodes();
         if (optimizeTopology) {
             LOG.debug("Optimizing the Kafka Streams graph for repartition nodes");
             optimizeKTableSourceTopics();
             maybeOptimizeRepartitionOperations();
+            if (optimizeSelfJoin) {

Review Comment:
   Thanks @vpapavas, I think I'm starting to understand some of your motivations here. Just to make sure I do, could we go over the list of examples below:
   
   1.
   ```
   stream1 = builder.stream("topic1");
   stream1.join(stream1)
   ```
   This case is optimizable, and condition #2 alone should be sufficient to validate that the optimization can be applied.
   
   2.
   ```
   stream1 = builder.stream("topic1");
   stream2 = builder.stream("topic1"); // same topic
   stream1.join(stream2)
   ```
   This case is optimizable, but since the logical plan still contains two graph nodes, we cannot validate the optimization.
   
   3.
   ```
   stream1 = builder.stream("topic1");
   stream1.mapValues(v -> v);
   stream2 = builder.stream("topic1"); // same topic
   stream1.join(stream2)
   ```
   This case is optimizable, but it seems our validation would exclude it?
   
   4.
   ```
   stream1 = builder.stream("topic1");
   stream2 = builder.stream("topic2");
   stream3 = stream1.merge(stream2);
   stream3.join(stream3);
   ```
   This case should be optimizable, but it seems our validation would exclude 
it?
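
   To make the difference between cases 1, 2, and 4 concrete, here is a minimal plain-Java sketch. It does not use the real `GraphNode` class or the PR's validation logic: `Node`, `source`, `derived`, `sameNode`, and `sameSourceTopics` are all hypothetical names made up for illustration. It only contrasts a node-identity check with a comparison of the source topics reachable from each join side, and it deliberately ignores intermediate operators that transform records, so it is not a proposed validation rule.

   ```java
   import java.util.ArrayList;
   import java.util.Arrays;
   import java.util.List;
   import java.util.Set;
   import java.util.TreeSet;

   public class SelfJoinCheckSketch {

       /** Hypothetical stand-in for a logical-plan node; NOT the real GraphNode class. */
       static final class Node {
           final String name;
           final String topic;       // non-null only for source nodes
           final List<Node> parents; // empty for source nodes

           Node(final String name, final String topic, final List<Node> parents) {
               this.name = name;
               this.topic = topic;
               this.parents = parents;
           }
       }

       /** Creates a source node reading from the given topic. */
       static Node source(final String name, final String topic) {
           return new Node(name, topic, new ArrayList<>());
       }

       /** Creates a non-source node (e.g. a merge) with the given parents. */
       static Node derived(final String name, final Node... parents) {
           return new Node(name, null, Arrays.asList(parents));
       }

       /** Collects the topics of every source node reachable by walking up from the node. */
       static Set<String> sourceTopics(final Node node) {
           final Set<String> topics = new TreeSet<>();
           if (node.topic != null) {
               topics.add(node.topic);
           }
           for (final Node parent : node.parents) {
               topics.addAll(sourceTopics(parent));
           }
           return topics;
       }

       /** Identity check: both join sides are the very same plan node. */
       static boolean sameNode(final Node left, final Node right) {
           return left == right;
       }

       /** Topic check: both join sides ultimately read from the same set of topics. */
       static boolean sameSourceTopics(final Node left, final Node right) {
           return sourceTopics(left).equals(sourceTopics(right));
       }

       public static void main(final String[] args) {
           final Node stream1 = source("stream1", "topic1");
           final Node stream2 = source("stream2", "topic1"); // same topic, different node
           final Node stream3 = derived("merge", stream1, source("other", "topic2"));

           System.out.println("case 1 sameNode: " + sameNode(stream1, stream1));
           System.out.println("case 2 sameNode: " + sameNode(stream1, stream2));
           System.out.println("case 2 sameSourceTopics: " + sameSourceTopics(stream1, stream2));
           System.out.println("case 4 sameNode: " + sameNode(stream3, stream3));
       }
   }
   ```

   In this toy model, case 2 is the one where the two checks disagree: the join sides are distinct nodes but read from the same topic.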


