[ https://issues.apache.org/jira/browse/HIVE-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ashutosh Chauhan updated HIVE-6041:
-----------------------------------
Resolution: Fixed
Fix Version/s: 0.13.0
Status: Resolved (was: Patch Available)
Committed to trunk. Thanks, Navis!
> Incorrect task dependency graph for skewed join optimization
> ------------------------------------------------------------
>
> Key: HIVE-6041
> URL: https://issues.apache.org/jira/browse/HIVE-6041
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.6.0, 0.7.0, 0.8.0, 0.9.0, 0.10.0, 0.11.0, 0.12.0
> Environment: Hadoop 1.0.3
> Reporter: Adrian Popescu
> Assignee: Navis
> Priority: Critical
> Fix For: 0.13.0
>
> Attachments: HIVE-6041.1.patch.txt
>
>
> The dependency graph among task stages is incorrect in the plan produced by
> the skewed join optimization, which is enabled through
> "hive.optimize.skewjoin". When no skewed keys exist, all tasks that follow
> the common join are filtered out at runtime.
> In particular, the conditional task in the optimized plan keeps no
> dependency link to the child tasks of the common join task in the original
> plan. The conditional task contains the map join task, which does maintain
> these dependencies, but when the map join task is filtered out (i.e., no
> skewed keys exist), the dependencies are lost. Hence, all the other task
> stages of the query (e.g., the move stage that writes the results into the
> result table) are skipped.
> The bug resides in "ql/optimizer/physical/GenMRSkewJoinProcessor.java", in
> the processSkewJoin() function, immediately after the ConditionalTask is
> created and its dependencies are set.
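
The failure mode described above can be sketched with a toy task graph. This is
only an illustration under stated assumptions: the classes and methods below
(Task, ConditionalTask, buildPlan, run) are hypothetical and are not Hive's
actual classes, and the "fix" branch shows just one possible way to keep the
dependency in this toy model, not necessarily what HIVE-6041.1.patch.txt does.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only: hypothetical classes, NOT Hive's Task/ConditionalTask.
final class SkewJoinDependencySketch {

    static class Task {
        final String name;
        final List<Task> children = new ArrayList<>();
        Task(String name) { this.name = name; }
        void addChild(Task child) { children.add(child); }
    }

    // A conditional task runs its candidate tasks only if they are selected at runtime.
    static final class ConditionalTask extends Task {
        final List<Task> candidates = new ArrayList<>();
        ConditionalTask(String name) { super(name); }
    }

    // Builds commonJoin -> conditional[mapJoin] -> move.
    // If conditionalKeepsChildren is false, only mapJoin points at the move stage,
    // which mirrors the situation described in the report.
    static Task buildPlan(boolean conditionalKeepsChildren) {
        Task commonJoin = new Task("commonJoin");
        ConditionalTask conditional = new ConditionalTask("conditional");
        Task mapJoin = new Task("mapJoin");
        Task move = new Task("move");
        commonJoin.addChild(conditional);
        conditional.candidates.add(mapJoin);
        mapJoin.addChild(move);
        if (conditionalKeepsChildren) {
            // Assumed remedy for this toy model: the conditional task also
            // depends-forward on the downstream stage.
            conditional.addChild(move);
        }
        return commonJoin;
    }

    // Returns the names of the tasks that run, in execution order.
    static List<String> run(Task root, boolean skewedKeysExist) {
        Set<String> executed = new LinkedHashSet<>();
        execute(root, skewedKeysExist, executed);
        return new ArrayList<>(executed);
    }

    private static void execute(Task task, boolean skewedKeysExist, Set<String> executed) {
        if (!executed.add(task.name)) {
            return; // already ran via another path
        }
        if (task instanceof ConditionalTask && skewedKeysExist) {
            // Candidates (the map join) are filtered out when no skewed keys exist.
            for (Task candidate : ((ConditionalTask) task).candidates) {
                execute(candidate, skewedKeysExist, executed);
            }
        }
        for (Task child : task.children) {
            execute(child, skewedKeysExist, executed);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(buildPlan(false), false)); // move stage is skipped
        System.out.println(run(buildPlan(true), false));  // move stage still runs
    }
}
```

In this sketch, the plan built with buildPlan(false) never reaches the move
stage when no skewed keys exist, because the only edge to it hangs off the
filtered-out map join task.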