okumin commented on code in PR #5452:
URL: https://github.com/apache/hive/pull/5452#discussion_r1818846673


##########
ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/PlanMapper.java:
##########
@@ -217,7 +230,11 @@ private void link(Object o1, Object o2, boolean mayMerge) {
     }
     if (mGroups.size() > 1) {
       if (!mayMerge) {
-        throw new RuntimeException("equivalence mapping violation");
+        LOG.warn("Illegally linking {} and {}", o1, o2);
+        if (failsWithIllegalLink) {
+          throw new RuntimeException("equivalence mapping violation");
+        }
+        isBroken.set(true);

Review Comment:
   I checked the use cases that query PlanMapper.
   (1) 
[RuntimeStatsPersistenceCheckerHook](https://github.com/apache/hive/blob/b431e11eb19def7df978547bd161ba102d59083c/ql/src/java/org/apache/hadoop/hive/ql/hooks/RuntimeStatsPersistenceCheckerHook.java)
 iterates over all signatures to test that RuntimeStatsPersister can correctly 
encode/decode OpTreeSignature. Only test code uses this class, and it 
probably works even if a signature has illegal links.
   (2) 
[StatsSources](https://github.com/apache/hive/blob/b431e11eb19def7df978547bd161ba102d59083c/ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/StatsSources.java)
 iterates over all elements to collect the valid ones for the optimizer. This is 
likely to work even with illegal links as long as we attach 
`IncorrectRuntimeStatsMarker` to the illegally linked elements.
   (3) 
[ReOptimizePlugin](https://github.com/apache/hive/blob/b431e11eb19def7df978547bd161ba102d59083c/ql/src/java/org/apache/hadoop/hive/ql/reexec/ReOptimizePlugin.java)
 iterates over all Operators to determine whether the execution plan has 
changed. This logic probably works even if there are illegal links.
   
   I think neither (1) nor (3) is affected by mapping violations. 
We have to take care of (2) because, if a mapping violation reflects a real 
problem, we may fill in the runtime stats of an unrelated Operator. Anyway, 
I'll try a PoC.
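
A minimal sketch of the concern in (2), under stated assumptions: `LinkSketch`, `Group`, `Mapper`, and `isTrusted` are hypothetical names and not Hive's actual classes, and the `broken` flag is only a stand-in for attaching something like `IncorrectRuntimeStatsMarker`. It assumes that an illegal link still merges the two groups when `failsWithIllegalLink` is off, and that a stats consumer should then skip every member of the merged group.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;

class LinkSketch {
  /** Hypothetical equivalence group; "broken" marks that it absorbed an illegal link. */
  static final class Group {
    final Set<Object> members = Collections.newSetFromMap(new IdentityHashMap<>());
    boolean broken;
  }

  /** Hypothetical, heavily simplified mapper; not the real PlanMapper. */
  static final class Mapper {
    private final Map<Object, Group> groupOf = new IdentityHashMap<>();
    private final boolean failsWithIllegalLink;

    Mapper(boolean failsWithIllegalLink) {
      this.failsWithIllegalLink = failsWithIllegalLink;
    }

    void link(Object o1, Object o2, boolean mayMerge) {
      Group g1 = groupOf.get(o1);
      Group g2 = groupOf.get(o2);
      if (g1 != null && g2 != null && g1 != g2) {
        // Both objects already belong to different groups.
        if (!mayMerge) {
          if (failsWithIllegalLink) {
            throw new RuntimeException("equivalence mapping violation");
          }
          g1.broken = true; // tolerate the violation but remember it
        }
        // Merge g2 into g1 and carry over any earlier violation.
        g1.broken |= g2.broken;
        for (Object member : g2.members) {
          g1.members.add(member);
          groupOf.put(member, g1);
        }
        return;
      }
      // At most one existing group: reuse it, or start a new one.
      Group target = g1 != null ? g1 : (g2 != null ? g2 : new Group());
      target.members.add(o1);
      target.members.add(o2);
      groupOf.put(o1, target);
      groupOf.put(o2, target);
    }

    /** A stats consumer would only use runtime stats for trusted objects. */
    boolean isTrusted(Object o) {
      Group g = groupOf.get(o);
      return g != null && !g.broken;
    }
  }

  public static void main(String[] args) {
    Mapper mapper = new Mapper(false);
    mapper.link("op1", "sig1", true);
    mapper.link("op2", "sig2", true);
    mapper.link("op3", "sig3", true);
    mapper.link("op1", "op2", false); // illegal link, tolerated
    System.out.println("op2 trusted: " + mapper.isTrusted("op2"));
    System.out.println("op3 trusted: " + mapper.isTrusted("op3"));
  }
}
```

Marking per group rather than with one mapper-wide flag is a design choice of this sketch: unrelated groups such as the one holding `op3` stay usable, while everything reachable through the illegal link is excluded.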



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

