[ 
https://issues.apache.org/jira/browse/IMPALA-8218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers reassigned IMPALA-8218:
-----------------------------------

    Assignee:     (was: Paul Rogers)

> Use "simple bin model" to estimate M:N, FK join cardinality
> -----------------------------------------------------------
>
>                 Key: IMPALA-8218
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8218
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 3.1.0
>            Reporter: Paul Rogers
>            Priority: Minor
>
> When computing join cardinality, the planner must determine how much a 
> filters reduce the cardinality of join keys. For example, in a M:1 (FK/PK) 
> join, filtering on the left (M, FK) side will reduce the number of rows 
> available to join. How much does that filtering reduce the keys?
> The "selectivity" of a filter is the probability that any one row will pass 
> through the filter:
> {noformat}
> |T'| = |T| * sel(f)
> sel(f) = |T'| / |T|
> {noformat}
> Where:
> * \{{|T|}} is the cardinality of some table or relation T.
> * \{{|T'|}} is the cardinality of the new relation T' after filtering.
> * \{{sel(f)}} is the selectivity of some filter f.
> The current model makes the standard assumption that the NDV of key columns 
> reduces by the same amount as the table cardinality. That is:
> {noformat}
> |T.k'| = |T.k| * sel(f)
> {noformat}
> This assumption is incorrect as explained below. The correct expression is 
> explained in Swami & Schiefer (S&S), [On the Estimation of Join Result 
> Sizes|https://pdfs.semanticscholar.org/2735/672262940a23cfce8d4cc1e25c0191de7da7.pdf]
>  (S&S), section 5.
> h4. Motivation
> It is easiest to reason about these issues in the M:1 (FK/PK) case, but the 
> logic also applies to both sides of the M:N (many-to-many, generic) case.
> Consider an example. We have a M:1 relationship with two tables: D (for 
> detail) and M (for master). A M:1 relationship exists:
> {noformat}
> D.fk --> M.pk
> {noformat}
> The foreign key (fk) in the detail table points to the primary key (pk) in 
> the master table. Note that keys in the master table are unique, so the 
> simple filtering expression works fine:
> {noformat}
> |M.pk'| = |M.pk| * sel(f)
> {noformat}
> On the detail side, there are multiple rows that have the same key. What we 
> want to know is, when we apply the filtering above, how many of the foreign 
> keys survive? Since there are multiple keys, the answer is not obvious. To 
> see, this, let's assume that the detail table has 1000 rows, and uses only 
> one FK value. If we select 500 rows, how many foreign keys are left? 
> Obviously, the answer is that there is still one FK value.
> h4. Correlated Filtering
> There is one case where simple filtering is the right answer: if the filter 
> is on the key column itself. Suppose we know that the left (detail) scan 
> applied a filter on the foreign key column. Maybe \{{D.fk = 123}}. In this 
> case (as explained in the S&S paper above) we know that \{{|D.fk'| = 1}}. In 
> the general case, we may have any operator (!=, <, etc.) in the filter so the 
> cardinality of the foreign key is simply the result of applying that filter:
> {noformat}
> |D.fk'| = |D.fk| * sel(f)
> {noformat}
> h4. Uncorrelated Filtering
> In the next case, we filter on columns other than the primary key column. For 
> simplicity, let us adopt the uniformity assumption from S&S paper: that 
> applying a filter on a column other than the key results in a random sampling 
> of the key column. As explained in S&S, section 5, we must use the "simple 
> urn model":
> {noformat}
> |T.k'| = urn( |T.k|, |T'| )
> url(c, n) = c * (1 - (1 - 1/c)^n)
> {noformat}
> Where:
> * \{{urn(c, n)}} is the urn model expression which gives the estimated 
> cardinality of a column as the result of a scan
> * \{{c}} is the cardinality of a key column
> * \{{n}} is the cardinality of the output relation of a scan
> See the paper for the reasoning behind this expression.
> h4. Combined Filtering
> To make the above work, we observe that, in relational theory, we get the 
> same result whether we apply all filters in one go, or apply them one-by-one. 
> We get the same result if we apply the filters during the scan (as Impala 
> does), or by first scanning all rows, then applying the filters afterward (as 
> some other engines do.)
> This observation allows us to break the calculation into two parts:
> * Determine the key NDV and table cardinality produced by just the correlated 
> filters.
> * Use the urn model to predict key cardinality from the uncorrelated filters.
> We start by sorting scan filters into two categories:
> * Correlated filters applied to a given key column (here, \{{D.fk}}).
> * All other filters (the uncorrelated filters.)
> We do this by first gathering all predicates applied to the detail (left) 
> relation. The detail for doing so is explained in IMPALA-8217 which describes 
> the need to adjust for filters applied to both sides of a join. Given that 
> set, we can do the following:
> * Remove from the set all those predicates which reference \{{D.fk}}. These 
> are the correlated predicates.
> * Remove correlated filtering from the scan cardinality to get the 
> cardinality due just to the uncorrelated predicates.
> * Apply correlated filtering to the key column NDV to get the reduced NDV 
> after filtering.
> * Apply the urn model to compute key cardinality after uncorrelated filtering.
> Suppose we have the following:
> * \{{F(D)}}: the set of all filters applied to the scan of D: \{{(f1(D), 
> f2(D), ... fn(D))}}.
> * \{{F(D.fk)}}: the subset of the above that apply to only the key column 
> \{{D.fk}}.
> Then we can compute or define the selectivity of these filters:
> {noformat}
> sel(scan(D)) = ∏ sel(Fi(D))
> = |D'| / |D|
> sel(F(D.fk)) = ∏ sel(Fi(D.fk))
> {noformat}
> Next we can work out the cardinalities required for inputs to the urn model:
> {noformat}
> |D.fk''| = |D.fk| * sel(F(D.fk))
> |D''| = |D| * sel(F(D.fk))
> = |D'| / sel(scan(D) * sel(F(D.fk))
> {noformat}
> Where:
> * \{{|D.fk''|}} is the key cardinality obtained by applying just the 
> correlated filters.
> * \{{|D''|}} is the scan cardinality that would result if we applied only the 
> correlated filters.
> Then, we can apply the uncorrelated filtering using the urn model:
> {noformat}
> |D.fk'| = urn( |D.fk''|, |D''| )
> {noformat}
> h4. Compound Keys
> The discussion above is for simple keys. If a key is compound (see 
> IMPALA-8014) we can refine the above as follows:
> * Compute each column in the key as described above.
> * Multiply the adjusted NDV's to get the cardinality of the key as a whole, 
> adjusting as explained in IMPALA-8014.
> h4. Complication: Compound Joins
> The calculation above is sound, but is complicated by several factors in 
> Impala's implementation. First is that the left side of a join is usually not 
> a table; it is usually another join. (This is a result of the left-deep 
> pattern adopted by most query optimizers.) Because of this, we actually do 
> not know the original table cardinality which we've called \{{|D|}}. Yes, we 
> do know the size of the tables that went into a join, and we know the size of 
> the relation that comes out of the join, but there is no reasonable number 
> for the base table since there are multiple.
> Instead, the above calculations have been based on what we do know:
> * \{{|T'|}} the result of all filtering (and joins). This is the cardinality 
> input to the present join.
> * \{{F}}, the set of all filters applied anywhere in the subtree below a join 
> input.
> * \{{|D.fk|}}, the key column NDV as reported from HMS stats.
> Using just these values we can calculate key cardinality as:
> {noformat}
> |D.fk''| = |D.fk| * sel(F(D.fk))
> |D''| = |D'| / sel(scan(D) * sel(F(D.fk))
> |D.fk'| = urn( |D.fk''|, |D''| )
> {noformat}
> h4. Tracking Adjusted NDV
> The above recomputes adjusted NDV at each join from first principals. It is 
> possible to simplify the task by working only one level at a time. Each scan 
> or join would maintain a running adjusted NDV for each column:
> * The scan node starts with the input NDVs as reported by HMS.
> * The scan node computes output NDVs by applying filters to each column and 
> tracking the result.
> * The join node uses the scan output NDVs as |T.k''| (the NDV after 
> correlated filtering.)
> * The join node tracks its own output NDVs that result from applying 
> uncorrelated filtering.
> With this approach, changes bubble up the operator tree one step at a time. 
> Debugging is easier since we can visualize the adjusted NDVs.
> In the approach described in above sections, we track the combined set of all 
> filters applied up the tree. With the approach described in this section, we 
> track the result of applying the filters rather than the filters themselves.
> h4. Complication: Non-Linear Filter Combination
> The approach above assumes we can apply filters in any order which is a 
> foundational assumption of relational theory. Unfortunately, Impala takes a 
> different approach: it uses an exponential back-off:
> {noformat}
> |T'| = |T| * ∏(i =0..) Fi^i
> {noformat}
> This means that the filters cannot be applied in any order, nor can we easily 
> back one filter out of the combined filter.
> Impala does this to compensate, in part, for the fact that Impala does not 
> compute selectivity for any but simple \{{col = const}} predicate.
> To allow the calculations above, we must remove the exponential back-off, 
> which requires computing selectivity for all predicates -- something that is 
> a good idea anyway. See IMPALA-8217.
> h4. Generalizing to the M:N (Generic) Case
> The discussion above discusses foreign keys in a M:1 join. The same logic 
> applies to both sides of a join in a M:N (many-to-many, "generic") join. 
> Calculations for the right-side table are a bit simpler because, in Impala, 
> the right input is always a base table.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

Reply via email to