Sungwoo Park created HIVE-27375:
-----------------------------------

             Summary: SharedWorkOptimizer assigns a common cache key to MapJoin 
operators that should not share MapJoin tables
                 Key: HIVE-27375
                 URL: https://issues.apache.org/jira/browse/HIVE-27375
             Project: Hive
          Issue Type: Bug
            Reporter: Sungwoo Park


When hive.optimize.shared.work.mapjoin.cache.reuse is set to true, 
SharedWorkOptimizer sometimes assigns a common cache key to MapJoin operators 
that should not share MapJoin tables. This bug occurs only for MapJoin 
operators with 3 or more parent operators.

Example:
MAPJOIN[575] (RS_83, GBY_66, RS_85)
MAPJOIN[585] (RS_212, RS_213, GBY_210)

In this example, both MAPJOIN[575] and MAPJOIN[585] have three parent 
operators. The current implementation assigns a common cache key to 
MAPJOIN[575] and MAPJOIN[585] because RS_83 and RS_212 are equivalent.

However, MAPJOIN[575] uses GBY_66 for its big table whereas MAPJOIN[585] uses 
GBY_210 for its big table. As a result, the MapJoin table loaded by one 
operator cannot be used by the other.
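
The mismatch can be illustrated with a small standalone sketch. This is not Hive code: the MapJoin class, the EQUIV table, and both key functions are hypothetical names invented for illustration. The sketch assumes that the faulty key depends only on the equivalence class of the first parent, and that a safer key would also record the big-table position and every small-table parent. The only equivalence taken from the report is that RS_83 and RS_212 are equivalent; all other operators are assumed equivalent only to themselves.

```java
import java.util.List;
import java.util.Map;

class MapJoinCacheKeySketch {

    /** Hypothetical stand-in for a MapJoin operator (not a Hive class). */
    static final class MapJoin {
        final String name;
        final List<String> parents;
        final int bigTablePos; // index into parents of the big-table input

        MapJoin(String name, List<String> parents, int bigTablePos) {
            this.name = name;
            this.parents = parents;
            this.bigTablePos = bigTablePos;
        }
    }

    // Assumed equivalence classes: only RS_83 ~ RS_212 comes from the report.
    static final Map<String, String> EQUIV = Map.of("RS_83", "RS_83", "RS_212", "RS_83");

    static String equivClass(String op) {
        // Operators without an entry are treated as equivalent only to themselves.
        return EQUIV.getOrDefault(op, op);
    }

    /** Assumed faulty behavior: the key looks at the first parent only. */
    static String naiveKey(MapJoin mj) {
        return equivClass(mj.parents.get(0));
    }

    /** Stricter key: big-table position plus every small-table parent with its position. */
    static String strictKey(MapJoin mj) {
        StringBuilder sb = new StringBuilder("big=" + mj.bigTablePos);
        for (int i = 0; i < mj.parents.size(); i++) {
            if (i == mj.bigTablePos) {
                continue; // the big table is streamed, not loaded into the hash table
            }
            sb.append('|').append(i).append(':').append(equivClass(mj.parents.get(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        MapJoin mj575 = new MapJoin("MAPJOIN[575]", List.of("RS_83", "GBY_66", "RS_85"), 1);
        MapJoin mj585 = new MapJoin("MAPJOIN[585]", List.of("RS_212", "RS_213", "GBY_210"), 2);

        System.out.println("naive keys equal:  " + naiveKey(mj575).equals(naiveKey(mj585)));
        System.out.println("strict keys equal: " + strictKey(mj575).equals(strictKey(mj585)));
    }
}
```

Under these assumptions the first-parent-only key is identical for both operators, while the stricter key differs because the big table sits at position 1 in MAPJOIN[575] and at position 2 in MAPJOIN[585].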




--
This message was sent by Atlassian Jira
(v8.20.10#820010)