Silun Dong created CALCITE-6846:
-----------------------------------
Summary: Support basic dphyp join reorder algorithm
Key: CALCITE-6846
URL: https://issues.apache.org/jira/browse/CALCITE-6846
Project: Calcite
Issue Type: New Feature
Components: core
Affects Versions: 1.38.0
Reporter: Silun Dong
Assignee: Silun Dong
Supports the basic dphyp join reorder algorithm.
For example :
{code:java}
SELECT
i_item_id
FROM store_sales, customer_demographics, date_dim, item, promotion
WHERE ss_sold_date_sk = d_date_sk AND
ss_item_sk = i_item_sk AND
ss_cdemo_sk = cd_demo_sk AND
ss_promo_sk = p_promo_sk {code}
The plan tree after pushing down filter :
{code:java}
LogicalProject(i_item_id=[$61])
LogicalJoin(condition=[=($7, $82)], joinType=[inner])
LogicalJoin(condition=[=($1, $60)], joinType=[inner])
LogicalJoin(condition=[=($22, $32)], joinType=[inner])
LogicalJoin(condition=[=($3, $23)], joinType=[inner])
LogicalTableScan(table=[[tpcds, store_sales]])
LogicalTableScan(table=[[tpcds, customer_demographics]])
LogicalTableScan(table=[[tpcds, date_dim]])
LogicalTableScan(table=[[tpcds, item]])
LogicalTableScan(table=[[tpcds, promotion]]){code}
Convert Joins into one HyperGraph :
{code:java}
LogicalProject(i_item_id=[$61])
HyperGraph(edges=[{0}——INNER——{1},{0}——INNER——{2},{0}——INNER——{3},{0}——INNER——{4}])
LogicalTableScan(table=[[tpcds, store_sales]])
LogicalTableScan(table=[[tpcds, customer_demographics]])
LogicalTableScan(table=[[tpcds, date_dim]])
LogicalTableScan(table=[[tpcds, item]])
LogicalTableScan(table=[[tpcds, promotion]]) {code}
After dphyp join reorder (with trimming fields and pushing down Project), the
plan is :
{code:java}
LogicalProject(i_item_id=[$1])
LogicalJoin(condition=[=($0, $2)], joinType=[inner])
LogicalProject(ss_cdemo_sk=[$0], i_item_id=[$2])
LogicalJoin(condition=[=($1, $3)], joinType=[inner])
LogicalProject(ss_cdemo_sk=[$1], ss_sold_date_sk=[$2], i_item_id=[$4])
LogicalJoin(condition=[=($0, $3)], joinType=[inner])
LogicalProject(ss_item_sk=[$0], ss_cdemo_sk=[$1],
ss_sold_date_sk=[$3])
LogicalJoin(condition=[=($2, $4)], joinType=[inner])
LogicalProject(ss_item_sk=[$1], ss_cdemo_sk=[$3],
ss_promo_sk=[$7], ss_sold_date_sk=[$22])
LogicalTableScan(table=[[tpcds, store_sales]])
LogicalProject(p_promo_sk=[$0])
LogicalTableScan(table=[[tpcds, promotion]])
LogicalProject(i_item_sk=[$0], i_item_id=[$1])
LogicalTableScan(table=[[tpcds, item]])
LogicalProject(d_date_sk=[$0])
LogicalTableScan(table=[[tpcds, date_dim]])
LogicalProject(cd_demo_sk=[$0])
LogicalTableScan(table=[[tpcds, customer_demographics]]) {code}
The main enumeration process of dphyp will be implemented in pr. However, it
only can process inner join for now and the simplification of hypergraph has
not yet been implemented.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)