GitHub user jianqiao opened a pull request:
https://github.com/apache/incubator-quickstep/pull/174
Push down disjunctive predicates to filter small-cardinality stored
relation early.
This PR implements an optimization (a physical plan transformation) that
pushes down disjunctive predicates to filter stored relations early when the
proper conditions are met.
Here we elaborate on the conditions. Let
```
P = p_{1,1} AND ... AND p_{1, m_1}
OR
...
OR
p_{n,1} AND ... AND p_{n, m_n}
```
be a predicate in _disjunctive normal form_.
Now consider each small-cardinality relation R. If, for each `i` in `1..n`,
there exists at least one predicate `p_{i, k_i}` that is applicable to R, then
we can construct a new predicate
```
P' = p_{1, k_1} OR ... OR p_{n, k_n}
```
and push down `P'` to be applied to R.
Also, if any conjunctive component in `P` contains more than one predicate
that is applicable to R, then we can combine all these applicable predicates as
a conjunctive component in `P'`.
Finally, note that if there exists a conjunctive component that contains no
predicate applicable to R, then the condition fails and we cannot do a push
down for R.
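The conditions above can be sketched in a few lines of Python. This is only an illustration, not the code in this PR: the `Atom`/`DNF` representation and the names `derive_pushdown` and `to_sql` are hypothetical, and "applicable to R" is assumed here to mean that the predicate references attributes of R only. The sample input is shaped like the nation filter in TPC-H Q07.

```python
from typing import FrozenSet, List, Optional, Tuple

# Hypothetical representation (not Quickstep's): an atomic predicate is a
# pair of (SQL text, set of relation names it references).
Atom = Tuple[str, FrozenSet[str]]
Conjunct = List[Atom]   # atoms joined by AND
DNF = List[Conjunct]    # conjunctive components joined by OR


def derive_pushdown(dnf: DNF, relation: str) -> Optional[DNF]:
    """Return P' for `relation`, or None if the condition fails."""
    pushed: DNF = []
    for conjunct in dnf:
        # Assumption: an atom is applicable when it references only `relation`.
        applicable = [atom for atom in conjunct
                      if atom[1] == frozenset({relation})]
        if not applicable:
            # A component with no applicable predicate blocks the push down.
            return None
        # All applicable atoms of this component stay together under AND.
        pushed.append(applicable)
    return pushed


def to_sql(dnf: DNF) -> str:
    """Render a DNF predicate as text, one parenthesized group per component."""
    return " OR ".join(
        "(" + " AND ".join(text for text, _ in conjunct) + ")"
        for conjunct in dnf
    )


N1 = frozenset({"n1"})
N2 = frozenset({"n2"})

# Sample predicate in the shape of the TPC-H Q07 nation filter.
Q07 = [
    [("n1.n_name = 'FRANCE'", N1), ("n2.n_name = 'GERMANY'", N2)],
    [("n1.n_name = 'GERMANY'", N1), ("n2.n_name = 'FRANCE'", N2)],
]

if __name__ == "__main__":
    print(to_sql(derive_pushdown(Q07, "n1")))
```

For `n1`, every conjunctive component contributes one applicable predicate, so the derived `P'` is a disjunction of the two `n1.n_name` comparisons. If a third component that mentions only some other relation were added to `Q07`, `derive_pushdown` would return `None` for `n1`.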
This optimization reduces the running time of TPC-H Q07 from ~17 seconds to
~4 seconds at scale factor 100 (SF100) on a CloudLab machine.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/incubator-quickstep push-down-disjunctive-predicate
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-quickstep/pull/174.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #174
----
commit 6a184c8ad981343dd50f04c76d3937bcdce34ddc
Author: Jianqiao Zhu <[email protected]>
Date: 2017-01-30T07:02:19Z
Push down low cost disjunctive predicates to filter the stored relations
early
----