[ https://issues.apache.org/jira/browse/IMPALA-8218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paul Rogers reassigned IMPALA-8218: ----------------------------------- Assignee: (was: Paul Rogers) > Use "simple bin model" to estimate M:N, FK join cardinality > ----------------------------------------------------------- > > Key: IMPALA-8218 > URL: https://issues.apache.org/jira/browse/IMPALA-8218 > Project: IMPALA > Issue Type: Improvement > Components: Frontend > Affects Versions: Impala 3.1.0 > Reporter: Paul Rogers > Priority: Minor > > When computing join cardinality, the planner must determine how much a > filters reduce the cardinality of join keys. For example, in a M:1 (FK/PK) > join, filtering on the left (M, FK) side will reduce the number of rows > available to join. How much does that filtering reduce the keys? > The "selectivity" of a filter is the probability that any one row will pass > through the filter: > {noformat} > |T'| = |T| * sel(f) > sel(f) = |T'| / |T| > {noformat} > Where: > * \{{|T|}} is the cardinality of some table or relation T. > * \{{|T'|}} is the cardinality of the new relation T' after filtering. > * \{{sel(f)}} is the selectivity of some filter f. > The current model makes the standard assumption that the NDV of key columns > reduces by the same amount as the table cardinality. That is: > {noformat} > |T.k'| = |T.k| * sel(f) > {noformat} > This assumption is incorrect as explained below. The correct expression is > explained in Swami & Schiefer (S&S), [On the Estimation of Join Result > Sizes|https://pdfs.semanticscholar.org/2735/672262940a23cfce8d4cc1e25c0191de7da7.pdf] > (S&S), section 5. > h4. Motivation > It is easiest to reason about these issues in the M:1 (FK/PK) case, but the > logic also applies to both sides of the M:N (many-to-many, generic) case. > Consider an example. We have a M:1 relationship with two tables: D (for > detail) and M (for master). A M:1 relationship exists: > {noformat} > D.fk --> M.pk > {noformat} > The foreign key (fk) in the detail table points to the primary key (pk) in > the master table. Note that keys in the master table are unique, so the > simple filtering expression works fine: > {noformat} > |M.pk'| = |M.pk| * sel(f) > {noformat} > On the detail side, there are multiple rows that have the same key. What we > want to know is, when we apply the filtering above, how many of the foreign > keys survive? Since there are multiple keys, the answer is not obvious. To > see, this, let's assume that the detail table has 1000 rows, and uses only > one FK value. If we select 500 rows, how many foreign keys are left? > Obviously, the answer is that there is still one FK value. > h4. Correlated Filtering > There is one case where simple filtering is the right answer: if the filter > is on the key column itself. Suppose we know that the left (detail) scan > applied a filter on the foreign key column. Maybe \{{D.fk = 123}}. In this > case (as explained in the S&S paper above) we know that \{{|D.fk'| = 1}}. In > the general case, we may have any operator (!=, <, etc.) in the filter so the > cardinality of the foreign key is simply the result of applying that filter: > {noformat} > |D.fk'| = |D.fk| * sel(f) > {noformat} > h4. Uncorrelated Filtering > In the next case, we filter on columns other than the primary key column. For > simplicity, let us adopt the uniformity assumption from S&S paper: that > applying a filter on a column other than the key results in a random sampling > of the key column. As explained in S&S, section 5, we must use the "simple > urn model": > {noformat} > |T.k'| = urn( |T.k|, |T'| ) > url(c, n) = c * (1 - (1 - 1/c)^n) > {noformat} > Where: > * \{{urn(c, n)}} is the urn model expression which gives the estimated > cardinality of a column as the result of a scan > * \{{c}} is the cardinality of a key column > * \{{n}} is the cardinality of the output relation of a scan > See the paper for the reasoning behind this expression. > h4. Combined Filtering > To make the above work, we observe that, in relational theory, we get the > same result whether we apply all filters in one go, or apply them one-by-one. > We get the same result if we apply the filters during the scan (as Impala > does), or by first scanning all rows, then applying the filters afterward (as > some other engines do.) > This observation allows us to break the calculation into two parts: > * Determine the key NDV and table cardinality produced by just the correlated > filters. > * Use the urn model to predict key cardinality from the uncorrelated filters. > We start by sorting scan filters into two categories: > * Correlated filters applied to a given key column (here, \{{D.fk}}). > * All other filters (the uncorrelated filters.) > We do this by first gathering all predicates applied to the detail (left) > relation. The detail for doing so is explained in IMPALA-8217 which describes > the need to adjust for filters applied to both sides of a join. Given that > set, we can do the following: > * Remove from the set all those predicates which reference \{{D.fk}}. These > are the correlated predicates. > * Remove correlated filtering from the scan cardinality to get the > cardinality due just to the uncorrelated predicates. > * Apply correlated filtering to the key column NDV to get the reduced NDV > after filtering. > * Apply the urn model to compute key cardinality after uncorrelated filtering. > Suppose we have the following: > * \{{F(D)}}: the set of all filters applied to the scan of D: \{{(f1(D), > f2(D), ... fn(D))}}. > * \{{F(D.fk)}}: the subset of the above that apply to only the key column > \{{D.fk}}. > Then we can compute or define the selectivity of these filters: > {noformat} > sel(scan(D)) = ∏ sel(Fi(D)) > = |D'| / |D| > sel(F(D.fk)) = ∏ sel(Fi(D.fk)) > {noformat} > Next we can work out the cardinalities required for inputs to the urn model: > {noformat} > |D.fk''| = |D.fk| * sel(F(D.fk)) > |D''| = |D| * sel(F(D.fk)) > = |D'| / sel(scan(D) * sel(F(D.fk)) > {noformat} > Where: > * \{{|D.fk''|}} is the key cardinality obtained by applying just the > correlated filters. > * \{{|D''|}} is the scan cardinality that would result if we applied only the > correlated filters. > Then, we can apply the uncorrelated filtering using the urn model: > {noformat} > |D.fk'| = urn( |D.fk''|, |D''| ) > {noformat} > h4. Compound Keys > The discussion above is for simple keys. If a key is compound (see > IMPALA-8014) we can refine the above as follows: > * Compute each column in the key as described above. > * Multiply the adjusted NDV's to get the cardinality of the key as a whole, > adjusting as explained in IMPALA-8014. > h4. Complication: Compound Joins > The calculation above is sound, but is complicated by several factors in > Impala's implementation. First is that the left side of a join is usually not > a table; it is usually another join. (This is a result of the left-deep > pattern adopted by most query optimizers.) Because of this, we actually do > not know the original table cardinality which we've called \{{|D|}}. Yes, we > do know the size of the tables that went into a join, and we know the size of > the relation that comes out of the join, but there is no reasonable number > for the base table since there are multiple. > Instead, the above calculations have been based on what we do know: > * \{{|T'|}} the result of all filtering (and joins). This is the cardinality > input to the present join. > * \{{F}}, the set of all filters applied anywhere in the subtree below a join > input. > * \{{|D.fk|}}, the key column NDV as reported from HMS stats. > Using just these values we can calculate key cardinality as: > {noformat} > |D.fk''| = |D.fk| * sel(F(D.fk)) > |D''| = |D'| / sel(scan(D) * sel(F(D.fk)) > |D.fk'| = urn( |D.fk''|, |D''| ) > {noformat} > h4. Tracking Adjusted NDV > The above recomputes adjusted NDV at each join from first principals. It is > possible to simplify the task by working only one level at a time. Each scan > or join would maintain a running adjusted NDV for each column: > * The scan node starts with the input NDVs as reported by HMS. > * The scan node computes output NDVs by applying filters to each column and > tracking the result. > * The join node uses the scan output NDVs as |T.k''| (the NDV after > correlated filtering.) > * The join node tracks its own output NDVs that result from applying > uncorrelated filtering. > With this approach, changes bubble up the operator tree one step at a time. > Debugging is easier since we can visualize the adjusted NDVs. > In the approach described in above sections, we track the combined set of all > filters applied up the tree. With the approach described in this section, we > track the result of applying the filters rather than the filters themselves. > h4. Complication: Non-Linear Filter Combination > The approach above assumes we can apply filters in any order which is a > foundational assumption of relational theory. Unfortunately, Impala takes a > different approach: it uses an exponential back-off: > {noformat} > |T'| = |T| * ∏(i =0..) Fi^i > {noformat} > This means that the filters cannot be applied in any order, nor can we easily > back one filter out of the combined filter. > Impala does this to compensate, in part, for the fact that Impala does not > compute selectivity for any but simple \{{col = const}} predicate. > To allow the calculations above, we must remove the exponential back-off, > which requires computing selectivity for all predicates -- something that is > a good idea anyway. See IMPALA-8217. > h4. Generalizing to the M:N (Generic) Case > The discussion above discusses foreign keys in a M:1 join. The same logic > applies to both sides of a join in a M:N (many-to-many, "generic") join. > Calculations for the right-side table are a bit simpler because, in Impala, > the right input is always a base table. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org