[ https://issues.apache.org/jira/browse/IMPALA-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aman Sinha reassigned IMPALA-988:
---------------------------------

    Assignee: Aman Sinha

> Join strategy (broadcast vs shuffle) decision does not take memory consumption and other joins into account
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-988
>                 URL: https://issues.apache.org/jira/browse/IMPALA-988
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 1.2.1
>            Reporter: Alan Choi
>            Assignee: Aman Sinha
>            Priority: Minor
>              Labels: resource-management
>
> The amount of available memory changes the trade-off between broadcast and
> shuffle join strategies: if switching to shuffle join can avoid spilling to
> disk, it may be worth paying the cost of the additional network transfer.
> There are two issues:
> 1. The join strategy decision takes only the query mem-limit into account and
> ignores the process mem-limit.
> 2. The join strategy decision does not take other joins of the same query into
> account. When multiple joins are present, memory consumption can be very high.
>
> I ([~tarmstr...@cloudera.com]) don't think we should attempt to fix #1 -
> there's a phase ordering problem here - we currently choose the
> best-performing plan, then decide how much memory to allocate in admission
> control based on that plan. We can't preserve that while attempting to change
> the plan to fit the mem_limit. That said, I think the current heuristic is a
> little too aggressive about picking broadcast when the right side is very
> large - it should probably bias more towards shuffle as the right side gets
> larger.
>
> Note that when IMPALA-3200 is completed, this shouldn't prevent the query
> from running to completion, but it will still affect performance.
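
To make issue #2 concrete, here is a rough sketch of a strategy choice that tracks the memory already committed by earlier joins in the same query. This is not Impala's planner code: the cost formulas, the `Join` class, `choose_strategies`, and the per-node memory limit parameter are all simplifying assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Join:
    """Hypothetical join description: estimated input sizes in bytes."""
    lhs_bytes: int
    rhs_bytes: int


def choose_strategies(joins, num_nodes, mem_limit_per_node):
    """Pick 'broadcast' or 'shuffle' for each join in order.

    Assumed (simplified) cost model, not taken from Impala:
      - broadcast sends the whole right side to every node and
        holds the whole right side in memory on every node;
      - shuffle sends both sides over the network once and holds
        roughly 1/num_nodes of the right side on each node.
    """
    used = 0  # per-node memory already committed by earlier joins
    plan = []
    for j in joins:
        broadcast_net = j.rhs_bytes * num_nodes
        shuffle_net = j.lhs_bytes + j.rhs_bytes
        broadcast_mem = j.rhs_bytes
        shuffle_mem = -(-j.rhs_bytes // num_nodes)  # ceiling division

        # Broadcast must be cheaper on the network AND still fit in
        # memory alongside the joins that were already planned.
        fits = used + broadcast_mem <= mem_limit_per_node
        if broadcast_net <= shuffle_net and fits:
            plan.append("broadcast")
            used += broadcast_mem
        else:
            plan.append("shuffle")
            used += shuffle_mem
    return plan


if __name__ == "__main__":
    joins = [Join(lhs_bytes=1000, rhs_bytes=10), Join(lhs_bytes=1000, rhs_bytes=10)]
    print(choose_strategies(joins, num_nodes=4, mem_limit_per_node=15))
```

With a per-node limit of 15 bytes, the first join takes the broadcast slot and the second falls back to shuffle even though broadcast is cheaper on the network when each join is considered in isolation. Because broadcast network cost grows with `rhs_bytes * num_nodes`, this toy model also leans towards shuffle as the right side gets larger.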