ngsg commented on code in PR #4043:
URL: https://github.com/apache/hive/pull/4043#discussion_r1390737001


##########
ql/src/test/results/clientpositive/perf/tpcds30tb/tez/query2.q.out:
##########
@@ -11,7 +11,7 @@ STAGE PLANS:
         Map 11 <- Map 6 (BROADCAST_EDGE), Union 12 (CONTAINS)
         Map 14 <- Map 6 (BROADCAST_EDGE), Union 12 (CONTAINS)
         Map 5 <- Map 6 (BROADCAST_EDGE), Union 2 (CONTAINS)
-        Map 6 <- Reducer 8 (BROADCAST_EDGE), Reducer 9 (BROADCAST_EDGE)
+        Map 6 <- Reducer 10 (BROADCAST_EDGE), Reducer 8 (BROADCAST_EDGE), Reducer 9 (BROADCAST_EDGE)

Review Comment:
   The current ParallelEdgeFixer (PEF) does not update 
RuntimeValueInformation (RVI) correctly. Because TezCompiler creates SemiJoin 
edges based on RVI, some edges end up missing from the plan.
   
   The edge between Map 6 and Reducer 10 is one of the missing edges. After 
SWO, Map 6 has 2 incoming SemiJoin edges that come from the same reducer, so 
PEF inserts a SEL-RS pair to prevent a parallel edge. However, it does not 
update the RVI of the parent of the inserted SEL-RS, which is why the previous 
plan does not contain an edge between Map 6 and Reducer 10.
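
As a rough illustration of the failure mode described above, here is a toy Java model. It is only a sketch under stated assumptions, not Hive's actual code: the class `RviSketch`, the operator names (`RS_a`, `RS_b`, `TS_map6`), and the rule that an edge is created only for a registered RS with no child are all hypothetical stand-ins for the real RVI bookkeeping.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: none of these names are Hive classes or methods.
public class RviSketch {

    // Appends a SEL-RS pair after `rs` and returns the name of the new RS.
    // When updateRvi is true, the registry entry is re-keyed to the new RS.
    static String insertSelRs(Map<String, String> child, Map<String, String> rvi,
                              String rs, boolean updateRvi) {
        String sel = "SEL_after_" + rs;
        String newRs = "RS_after_" + rs;
        child.put(rs, sel);
        child.put(sel, newRs);
        if (updateRvi && rvi.containsKey(rs)) {
            rvi.put(newRs, rvi.remove(rs));
        }
        return newRs;
    }

    // Assumption of this sketch: an edge is derived only from registry
    // entries whose RS is terminal, i.e. has no child operator.
    static List<String> semiJoinEdges(Map<String, String> child, Map<String, String> rvi) {
        List<String> edges = new ArrayList<>();
        for (Map.Entry<String, String> e : rvi.entrySet()) {
            if (!child.containsKey(e.getKey())) {
                edges.add(e.getKey() + " -> " + e.getValue());
            }
        }
        return edges;
    }

    // Builds two semijoin sources feeding the same table scan, inserts
    // SEL-RS after one of them, and returns the edges that would be created.
    static List<String> run(boolean updateRvi) {
        Map<String, String> child = new HashMap<>();
        Map<String, String> rvi = new TreeMap<>(); // sorted for stable output
        rvi.put("RS_a", "TS_map6");
        rvi.put("RS_b", "TS_map6");
        insertSelRs(child, rvi, "RS_b", updateRvi);
        return semiJoinEdges(child, rvi);
    }

    public static void main(String[] args) {
        System.out.println("stale registry:   " + run(false));
        System.out.println("updated registry: " + run(true));
    }
}
```

In this model the stale registry yields one edge while the updated registry yields two, which mirrors the missing Map 6 / Reducer 10 edge only by analogy.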
   
   I have attached 3 operator graphs to help illustrate this. All graphs were 
generated during the TPCDS30TB query2 test.
   Before applying PEF:
   <img width="395" alt="master before dot 1" 
src="https://github.com/apache/hive/assets/29757139/0a096c15-6286-48d6-a348-e7201b4a127d">
   
   After applying current PEF:
   <img width="454" alt="master after dot 1" 
src="https://github.com/apache/hive/assets/29757139/50d11e34-3c94-4812-b5dd-6b7804899b4b">
   
   After applying modified PEF:
   <img width="264" alt="h27006 after dot 1" 
src="https://github.com/apache/hive/assets/29757139/58e2f472-7dfb-4688-b639-883fe9b8acff">
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

