Sungwoo Park created HIVE-27375:
-----------------------------------

             Summary: SharedWorkOptimizer assigns a common cache key to MapJoin operators that should not share MapJoin tables
                 Key: HIVE-27375
                 URL: https://issues.apache.org/jira/browse/HIVE-27375
             Project: Hive
          Issue Type: Bug
            Reporter: Sungwoo Park

When hive.optimize.shared.work.mapjoin.cache.reuse is set to true, SharedWorkOptimizer sometimes assigns a common cache key to MapJoin operators that should not share MapJoin tables. This bug occurs only for MapJoin operators with 3 or more parent operators.

Example:

MAPJOIN[575] (RS_83, GBY_66, RS_85)
MAPJOIN[585] (RS_212, RS_213, GBY_210)

In this example, both MAPJOIN[575] and MAPJOIN[585] have three parent operators. The current implementation assigns a common cache key to MAPJOIN[575] and MAPJOIN[585] because RS_83 and RS_212 are equivalent. However, MAPJOIN[575] uses GBY_66 (its second parent) for its big table, whereas MAPJOIN[585] uses GBY_210 (its third parent) for its big table. As a result, the MapJoin table loaded by one operator cannot be used by the other.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
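
The collision described in this report can be illustrated with a small toy model in Java. This is a sketch under stated assumptions, not Hive code: ToyMapJoin, naiveKey, strictKey, and the signature strings are all hypothetical names made up for illustration. The only facts taken from the report are the parent lists, the big-table choices, and the equivalence of RS_83 and RS_212. The sketch assumes, as a simplification, that the problematic key is derived from the first parent alone; the actual key derivation in SharedWorkOptimizer may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MapJoinCacheKeySketch {

    /** Toy stand-in for a MapJoin operator. Hypothetical; not a Hive class. */
    public static final class ToyMapJoin {
        final String name;
        final List<String> parentSignatures; // equivalent parents share a signature
        final int bigTablePos;               // index of the big-table parent

        ToyMapJoin(String name, List<String> parentSignatures, int bigTablePos) {
            this.name = name;
            this.parentSignatures = parentSignatures;
            this.bigTablePos = bigTablePos;
        }
    }

    /** Assumed simplification of the bug: the key looks only at the first parent. */
    public static String naiveKey(ToyMapJoin mj) {
        return mj.parentSignatures.get(0);
    }

    /** One possible stricter key: big-table position plus every small-table parent. */
    public static String strictKey(ToyMapJoin mj) {
        List<String> parts = new ArrayList<>();
        parts.add("big=" + mj.bigTablePos);
        for (int i = 0; i < mj.parentSignatures.size(); i++) {
            if (i != mj.bigTablePos) {
                parts.add(i + ":" + mj.parentSignatures.get(i));
            }
        }
        return String.join("|", parts);
    }

    /** MAPJOIN[575] (RS_83, GBY_66, RS_85), big table GBY_66 at index 1. */
    public static ToyMapJoin mapJoin575() {
        // "sig_RS_A" is shared with RS_212 because the report says they are equivalent.
        return new ToyMapJoin("MAPJOIN[575]",
                Arrays.asList("sig_RS_A", "sig_GBY_66", "sig_RS_85"), 1);
    }

    /** MAPJOIN[585] (RS_212, RS_213, GBY_210), big table GBY_210 at index 2. */
    public static ToyMapJoin mapJoin585() {
        return new ToyMapJoin("MAPJOIN[585]",
                Arrays.asList("sig_RS_A", "sig_RS_213", "sig_GBY_210"), 2);
    }

    public static void main(String[] args) {
        ToyMapJoin a = mapJoin575();
        ToyMapJoin b = mapJoin585();
        // The naive keys collide even though the big tables differ.
        System.out.println("naive keys equal:  " + naiveKey(a).equals(naiveKey(b)));
        // The stricter keys keep the two operators apart.
        System.out.println("strict keys equal: " + strictKey(a).equals(strictKey(b)));
    }
}
```

In this toy model the naive keys match while the stricter keys do not, which mirrors the symptom above: a key that ignores the big-table position lets two MapJoin operators with different small-table sets appear interchangeable.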