Silun Dong created CALCITE-6846:
-----------------------------------

             Summary: Support basic dphyp join reorder algorithm
                 Key: CALCITE-6846
                 URL: https://issues.apache.org/jira/browse/CALCITE-6846
             Project: Calcite
          Issue Type: New Feature
          Components: core
    Affects Versions: 1.38.0
            Reporter: Silun Dong
            Assignee: Silun Dong


Add support for the basic dphyp join reorder algorithm.
For example:
{code:java}
SELECT
    i_item_id
FROM store_sales, customer_demographics, date_dim, item, promotion
WHERE ss_sold_date_sk = d_date_sk AND
    ss_item_sk = i_item_sk AND
    ss_cdemo_sk = cd_demo_sk AND
    ss_promo_sk = p_promo_sk {code}
The plan tree after pushing down the filter:
{code:java}
LogicalProject(i_item_id=[$61])
  LogicalJoin(condition=[=($7, $82)], joinType=[inner])
    LogicalJoin(condition=[=($1, $60)], joinType=[inner])
      LogicalJoin(condition=[=($22, $32)], joinType=[inner])
        LogicalJoin(condition=[=($3, $23)], joinType=[inner])
          LogicalTableScan(table=[[tpcds, store_sales]])
          LogicalTableScan(table=[[tpcds, customer_demographics]])
        LogicalTableScan(table=[[tpcds, date_dim]])
      LogicalTableScan(table=[[tpcds, item]])
    LogicalTableScan(table=[[tpcds, promotion]]){code}
The joins are then converted into a single HyperGraph:
{code:java}
LogicalProject(i_item_id=[$61])
  HyperGraph(edges=[{0}——INNER——{1},{0}——INNER——{2},{0}——INNER——{3},{0}——INNER——{4}])
    LogicalTableScan(table=[[tpcds, store_sales]])
    LogicalTableScan(table=[[tpcds, customer_demographics]])
    LogicalTableScan(table=[[tpcds, date_dim]])
    LogicalTableScan(table=[[tpcds, item]])
    LogicalTableScan(table=[[tpcds, promotion]]) {code}
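To make the edge list above concrete, here is a minimal sketch of how such a hypergraph could be held in memory, with each side of an edge stored as a bitmask of input ordinals. The class `HyperGraphSketch`, its `Edge` type and the `neighbors` and `starExample` methods are hypothetical names chosen for illustration; they are not the classes added by this issue. For a hyperedge whose far side has several inputs, the sketch keeps only the lowest ordinal as a representative, which is an assumption about how the neighborhood is built.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a join hypergraph; not Calcite's HyperGraph class. */
final class HyperGraphSketch {
  /** One hyperedge between two disjoint sets of inputs, each stored as a bitmask. */
  static final class Edge {
    final long left;
    final long right;

    Edge(long left, long right) {
      this.left = left;
      this.right = right;
    }
  }

  final List<Edge> edges = new ArrayList<>();

  void addEdge(long left, long right) {
    edges.add(new Edge(left, right));
  }

  /**
   * Returns the inputs reachable from {@code subset} over a single edge,
   * skipping edges that touch {@code subset} or {@code forbidden} on the far side.
   * One representative (the lowest ordinal) is kept per far side.
   */
  long neighbors(long subset, long forbidden) {
    long blocked = subset | forbidden;
    long result = 0L;
    for (Edge e : edges) {
      if ((e.left & ~subset) == 0 && (e.right & blocked) == 0) {
        result |= Long.lowestOneBit(e.right);
      } else if ((e.right & ~subset) == 0 && (e.left & blocked) == 0) {
        result |= Long.lowestOneBit(e.left);
      }
    }
    return result;
  }

  /** Builds the star from the example: {0}-{1}, {0}-{2}, {0}-{3}, {0}-{4}. */
  static HyperGraphSketch starExample() {
    HyperGraphSketch g = new HyperGraphSketch();
    for (int dim = 1; dim <= 4; dim++) {
      g.addEdge(1L, 1L << dim);
    }
    return g;
  }
}
```

In this sketch, `starExample().neighbors(1L, 0L)` yields the mask of inputs 1 to 4, since store_sales (input 0) is joined to every dimension table, while a dimension table on its own only has input 0 as a neighbor.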
After dphyp join reorder (with field trimming and Project pushdown), the plan is:
{code:java}
LogicalProject(i_item_id=[$1])
  LogicalJoin(condition=[=($0, $2)], joinType=[inner])
    LogicalProject(ss_cdemo_sk=[$0], i_item_id=[$2])
      LogicalJoin(condition=[=($1, $3)], joinType=[inner])
        LogicalProject(ss_cdemo_sk=[$1], ss_sold_date_sk=[$2], i_item_id=[$4])
          LogicalJoin(condition=[=($0, $3)], joinType=[inner])
            LogicalProject(ss_item_sk=[$0], ss_cdemo_sk=[$1], ss_sold_date_sk=[$3])
              LogicalJoin(condition=[=($2, $4)], joinType=[inner])
                LogicalProject(ss_item_sk=[$1], ss_cdemo_sk=[$3], ss_promo_sk=[$7], ss_sold_date_sk=[$22])
                  LogicalTableScan(table=[[tpcds, store_sales]])
                LogicalProject(p_promo_sk=[$0])
                  LogicalTableScan(table=[[tpcds, promotion]])
            LogicalProject(i_item_sk=[$0], i_item_id=[$1])
              LogicalTableScan(table=[[tpcds, item]])
        LogicalProject(d_date_sk=[$0])
          LogicalTableScan(table=[[tpcds, date_dim]])
    LogicalProject(cd_demo_sk=[$0])
      LogicalTableScan(table=[[tpcds, customer_demographics]]) {code}
The main enumeration process of dphyp will be implemented in the PR. For now it 
can only process inner joins, and hypergraph simplification has not yet been 
implemented.
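As a rough illustration of what the enumeration has to achieve for inner joins, the sketch below runs a plain dynamic program over subsets of inputs and only combines two subsets when some edge connects them, so no cross products are produced. This is a simpler subset-based stand-in, not the dphyp enumeration itself, which avoids visiting unconnected candidates in the first place. The class `JoinOrderSketch`, its methods, the per-edge selectivities and the cost function (sum of intermediate row counts) are all assumptions made for the example.

```java
import java.util.Arrays;

/** Hypothetical sketch: subset DP over connected pairs; not the dphyp enumeration. */
final class JoinOrderSketch {
  /** True if some edge {left, right} has one side inside a and the other inside b. */
  static boolean connects(long[][] edges, long a, long b) {
    for (long[] e : edges) {
      boolean ab = (e[0] & ~a) == 0 && (e[1] & ~b) == 0;
      boolean ba = (e[0] & ~b) == 0 && (e[1] & ~a) == 0;
      if (ab || ba) {
        return true;
      }
    }
    return false;
  }

  /** Estimated rows of subset s: product of input rows and of selectivities of edges inside s. */
  static double card(long s, long[][] edges, double[] sel, double[] rows) {
    double c = 1.0;
    for (int i = 0; i < rows.length; i++) {
      if (((s >> i) & 1L) != 0) {
        c *= rows[i];
      }
    }
    for (int k = 0; k < edges.length; k++) {
      if (((edges[k][0] | edges[k][1]) & ~s) == 0) {
        c *= sel[k];
      }
    }
    return c;
  }

  /** Returns the cheapest bushy plan over all n inputs, or null if the graph is disconnected. */
  static String bestPlan(int n, long[][] edges, double[] sel, double[] rows) {
    int full = (1 << n) - 1;
    double[] cost = new double[full + 1];
    String[] plan = new String[full + 1];
    Arrays.fill(cost, Double.POSITIVE_INFINITY);
    for (int i = 0; i < n; i++) {
      cost[1 << i] = 0.0;               // scanning a base input is free in this toy model
      plan[1 << i] = String.valueOf(i);
    }
    // Ascending order guarantees both halves of every split are solved first.
    for (int s = 1; s <= full; s++) {
      if (Integer.bitCount(s) < 2) {
        continue;
      }
      for (int s1 = (s - 1) & s; s1 > 0; s1 = (s1 - 1) & s) {
        int s2 = s & ~s1;
        if (plan[s1] == null || plan[s2] == null || !connects(edges, s1, s2)) {
          continue;                     // skip unsolved halves and cross products
        }
        double c = cost[s1] + cost[s2] + card(s, edges, sel, rows);
        if (c < cost[s]) {
          cost[s] = c;
          plan[s] = "(" + plan[s1] + " JOIN " + plan[s2] + ")";
        }
      }
    }
    return plan[full];
  }
}
```

With the star-shaped edges from the example, every connected subset of two or more inputs contains input 0, so each step of the resulting plan joins one more dimension table onto the store_sales side, which matches the left-deep shape of the reordered plan shown above.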


