[ https://issues.apache.org/jira/browse/IMPALA-6388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Taras Bobrovytsky resolved IMPALA-6388.
---------------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 2.12.0

{noformat}
commit f8b406222de8f41765ef1d130e2debbd8ab06369
Author: Taras Bobrovytsky <tbobrovyt...@cloudera.com>
Date:   Thu Jan 11 17:01:07 2018 -0800

    IMPALA-6388: Fix the Union node number of hosts estimation

    Before this patch, we would estimate the number of hosts for the
    union node by looking only at the first union operand. This is
    obviously incorrect and led us to underestimate the value. We fix
    the problem by setting the estimate to be the maximum of its
    children.

    Testing:
    - Added a planner test that reproduces the issue

    Change-Id: I51e1ecca8dbc84b2b5a72708667b2799d00279f0
    Reviewed-on: http://gerrit.cloudera.org:8080/9017
    Reviewed-by: Tim Armstrong <tarmstr...@cloudera.com>
    Tested-by: Impala Public Jenkins
{noformat}

> UnionNode sets the number of nodes incorrectly
> ----------------------------------------------
>
>                 Key: IMPALA-6388
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6388
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 2.5.0, Impala 2.6.0, Impala 2.7.0, Impala 2.8.0,
>                      Impala 2.9.0, Impala 2.10.0, Impala 2.11.0
>            Reporter: Alexander Behm
>            Assignee: Taras Bobrovytsky
>            Priority: Critical
>              Labels: planner
>             Fix For: Impala 2.12.0
>
>
> The UnionNode plan node incorrectly sets the number of nodes based on its
> first child. An inaccurate number of nodes can lead to bad planning
> decisions, e.g. wrong join order or strategy.
> A better policy would be to set the number of nodes based on the max nodes
> over all the union's children. That number might still underestimate the real
> number of nodes, but significantly less so.
> Getting a more accurate estimate would involve keeping track of the actual
> list of hosts in all plan nodes. Let's focus on the simpler solution outlined
> above first.
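
The difference between the two policies described in the issue can be illustrated with a minimal sketch. This is not the Impala planner code: the class name UnionNumNodesSketch, both method names, and the use of -1 for "no estimate available" are assumptions made for illustration only. Each list element stands for the estimated number of hosts of one union operand.

```java
import java.util.List;

public class UnionNumNodesSketch {
    // Hypothetical sentinel for "no estimate available"; an assumption of this sketch.
    static final int UNKNOWN = -1;

    /** Policy described as the bug: only the first union operand is consulted. */
    static int numNodesFromFirstChild(List<Integer> childNumNodes) {
        return childNumNodes.isEmpty() ? UNKNOWN : childNumNodes.get(0);
    }

    /** Policy described as the fix: take the maximum over all union operands. */
    static int numNodesFromMaxChild(List<Integer> childNumNodes) {
        int max = UNKNOWN;
        for (int n : childNumNodes) {
            max = Math.max(max, n);
        }
        return max;
    }

    public static void main(String[] args) {
        // Three operands estimated to run on 1, 10 and 3 hosts respectively.
        List<Integer> children = List.of(1, 10, 3);
        System.out.println("first child: " + numNodesFromFirstChild(children));
        System.out.println("max child:   " + numNodesFromMaxChild(children));
    }
}
```

With operands estimated at 1, 10 and 3 hosts, the first-child policy reports 1 while the max policy reports 10. As the issue notes, the maximum can still underestimate: if the operands run on disjoint sets of hosts, the real count could be as high as the sum, which cannot be known without tracking the actual host lists.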