[ 
https://issues.apache.org/jira/browse/HIVE-20489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani updated HIVE-20489:
-----------------------------------
    Description: 
EXPLAIN on a query that joins 47 views (around 94 joins after view expansion) 
appears to hang.  In this case the compiler tries to generate a plan that uses 
map joins with conditional tasks.

When the task graph is large and has many paths, compilation can become very 
slow.  The cause is the recursive traversal of the task graph in 
internTableDesc and deriveFinalExplainAttributes.  The recursion is a problem 
in two ways:
* For large graphs, the recursion fills up the stack.
* Instead of just finding the map works, the traversal walks every possible 
path from the root, so shared nodes are revisited many times, which causes a 
huge performance problem.
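
To illustrate the second point, here is a minimal sketch of how a recursive 
walk without a visited set behaves on a graph with shared nodes.  The Node 
class and the diamond-chain shape are assumptions made up for this example; 
this is not Hive's Task class or the actual code in internTableDesc.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a task node; this is NOT Hive's Task class.
final class Node {
    final List<Node> children = new ArrayList<>();
}

final class PathExplosionDemo {
    // Builds a chain of `layers` diamonds: every junction has two children,
    // and both of them lead to the same next junction.
    static Node buildDiamondChain(int layers) {
        Node root = new Node();
        Node junction = root;
        for (int i = 0; i < layers; i++) {
            Node left = new Node();
            Node right = new Node();
            Node next = new Node();
            junction.children.add(left);
            junction.children.add(right);
            left.children.add(next);
            right.children.add(next);
            junction = next;
        }
        return root;
    }

    // Naive recursion with no visited set: a node is counted once for every
    // path that reaches it from the root.
    static long countVisitsNaive(Node node) {
        long visits = 1;
        for (Node child : node.children) {
            visits += countVisitsNaive(child);
        }
        return visits;
    }

    public static void main(String[] args) {
        int layers = 20;
        Node root = buildDiamondChain(layers);
        System.out.println("distinct nodes: " + (3 * layers + 1));
        System.out.println("naive visits:   " + countVisitsNaive(root));
    }
}
```

With 20 layers the graph has only 61 distinct nodes, but the naive walk makes 
4 * 2^20 - 3 = 4,194,301 visits, because the count roughly doubles with every 
extra layer.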

The fix replaces the recursive traversal with an iterative one that keeps 
track of the nodes already visited.  It uses getMRTasks, getSparkTasks and 
getTezTasks to do the iterative traversal.  These methods were made iterative 
in HIVE-17195, so when backporting this patch to an older release, please make 
sure HIVE-17195 is also present in that release. 
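
The general shape of such a traversal might look like the sketch below.  It 
is not the code from the patch or from HIVE-17195: SketchTask, its kind 
field and the collect method are hypothetical names used only to show an 
explicit work list combined with a visited set.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a task; Hive's real Task class is not used here.
final class SketchTask {
    final String kind;  // e.g. "MR" or "CONDITIONAL" (labels invented for the sketch)
    final List<SketchTask> children = new ArrayList<>();

    SketchTask(String kind) {
        this.kind = kind;
    }
}

final class IterativeTraversalSketch {
    // Collects every reachable task of the given kind.  An explicit deque
    // replaces the call stack, and the identity-based visited set ensures
    // each node is processed once, however many paths lead to it.
    static List<SketchTask> collect(List<SketchTask> roots, String kind) {
        List<SketchTask> found = new ArrayList<>();
        Set<SketchTask> visited =
            Collections.newSetFromMap(new IdentityHashMap<SketchTask, Boolean>());
        Deque<SketchTask> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            SketchTask task = pending.pop();
            if (!visited.add(task)) {
                continue;  // already handled via another path
            }
            if (task.kind.equals(kind)) {
                found.add(task);
            }
            for (SketchTask child : task.children) {
                pending.push(child);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // A single diamond: root -> {a, b} -> shared
        SketchTask root = new SketchTask("CONDITIONAL");
        SketchTask a = new SketchTask("MR");
        SketchTask b = new SketchTask("MR");
        SketchTask shared = new SketchTask("MR");
        root.children.add(a);
        root.children.add(b);
        a.children.add(shared);
        b.children.add(shared);
        System.out.println(collect(Collections.singletonList(root), "MR").size());
    }
}
```

The work done is proportional to the number of nodes and edges rather than to 
the number of paths, and the depth of the graph no longer affects stack usage.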


  was:Explain on a query that joins 47 views, in effect around 94 joins after 
view expansion seems to take forever. 


> Explain plan of query hangs
> ---------------------------
>
>                 Key: HIVE-20489
>                 URL: https://issues.apache.org/jira/browse/HIVE-20489
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Janaki Lahorani
>            Assignee: Janaki Lahorani
>            Priority: Major
>             Fix For: 4.0.0
>
>         Attachments: HIVE-20489.1.patch, HIVE-20489.2.patch, 
> HIVE-20489.3.patch, HIVE-20489.4.patch
>


