Hi people,
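I am doing some benchmarking with Calcite for the sql-api in Apache Wayang, which
typically requires multi-condition joins to be split into "binary" joins, i.e.
joins with a single condition each. For example, the first plan below should be
rewritten into the second: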
LogicalJoin(condition=[AND(=($0, $27), =($10, $28), =($34, $2))], joinType=[inner]): rowcount = 118.65234375, cumulative cost = 1038.96484375
  LogicalJoin(condition=[=($0, $11)], joinType=[inner]): rowcount = 351.5625, cumulative cost = 820.3125
    LogicalJoin(condition=[=($0, $3)], joinType=[inner]): rowcount = 93.75, cumulative cost = 343.75
      LogicalFilter(condition=[SEARCH($1, Sarg['cs':CHAR(11), 'gaming':CHAR(11), 'mathematica']:CHAR(11))]): rowcount = 25.0, cumulative cost = 125.0
        LogicalTableScan(table=[[postgres, site]]): rowcount = 100.0, cumulative cost = 100.0
      LogicalFilter(condition=[SEARCH($6, Sarg[[10..100000]])]): rowcount = 25.0, cumulative cost = 125.0
        LogicalTableScan(table=[[postgres, so_user]]): rowcount = 100.0, cumulative cost = 100.0
    LogicalFilter(condition=[SEARCH($6, Sarg[[0..100]])]): rowcount = 25.0, cumulative cost = 125.0
      LogicalTableScan(table=[[postgres, question]]): rowcount = 100.0, cumulative cost = 100.0
  LogicalTableScan(table=[[postgres, answer]]): rowcount = 100.0, cumulative cost = 100.0

BinaryJoin(condition=[=($60, $2)], joinType=[inner])
  BinaryJoin(condition=[=($10, $41)], joinType=[inner])
    BinaryJoin(condition=[=($0, $27)], joinType=[inner])
      LogicalJoin(condition=[=($0, $11)], joinType=[inner])
        LogicalJoin(condition=[=($0, $3)], joinType=[inner])
          LogicalFilter(condition=[SEARCH($1, Sarg['cs':CHAR(11), 'gaming':CHAR(11), 'mathematica']:CHAR(11))])
            LogicalTableScan(table=[[postgres, site]])
          LogicalFilter(condition=[SEARCH($6, Sarg[[10..100000]])])
            LogicalTableScan(table=[[postgres, so_user]])
        LogicalFilter(condition=[SEARCH($6, Sarg[[0..100]])])
          LogicalTableScan(table=[[postgres, question]])
      LogicalTableScan(table=[[postgres, answer]])
    LogicalTableScan(table=[[postgres, answer]])
  LogicalTableScan(table=[[postgres, answer]])
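
To make the index arithmetic in the two plans above concrete, here is a minimal sketch in plain Java (no Calcite dependency). In Calcite, the field references in a join condition index into the concatenation of the left and right input fields, so splitting the AND means re-basing the right-side references for each stacked join. Everything named here is hypothetical: the class, the helper methods, and in particular LEFT_FIELD_COUNT = 27, which is an assumption since the plans do not show the field counts. ANSWER_FIELD_COUNT = 13 is only inferred from the offsets between the two plans ($28 becomes $41, $34 becomes $60).

```java
import java.util.ArrayList;
import java.util.List;

public class JoinIndexSketch {
    // Assumption: the left input of the top join exposes 27 fields, so $27 is
    // the first field of the right input (answer). Not shown in the plans.
    static final int LEFT_FIELD_COUNT = 27;
    // Inferred from the second plan: each extra answer scan shifts the
    // right-side indices by 13 ($28 -> $41, $34 -> $60).
    static final int ANSWER_FIELD_COUNT = 13;

    /** True if a combined-row index refers to the left input of a join. */
    static boolean isLeft(int index, int leftFieldCount) {
        return index < leftFieldCount;
    }

    /** Index local to the input (left or right) that a combined-row index points into. */
    static int localIndex(int index, int leftFieldCount) {
        return isLeft(index, leftFieldCount) ? index : index - leftFieldCount;
    }

    /** Re-bases a right-side index when extraCopies more right inputs sit below it. */
    static int shiftRight(int index, int extraCopies, int rightFieldCount) {
        return index + extraCopies * rightFieldCount;
    }

    /** Turns the conjuncts of one AND condition into one condition per stacked join. */
    static List<String> split(List<int[]> conjuncts) {
        List<String> conditions = new ArrayList<>();
        for (int i = 0; i < conjuncts.size(); i++) {
            int[] c = conjuncts.get(i);
            // Only the operand that points into the right input is shifted.
            int a = isLeft(c[0], LEFT_FIELD_COUNT) ? c[0] : shiftRight(c[0], i, ANSWER_FIELD_COUNT);
            int b = isLeft(c[1], LEFT_FIELD_COUNT) ? c[1] : shiftRight(c[1], i, ANSWER_FIELD_COUNT);
            conditions.add("=($" + a + ", $" + b + ")");
        }
        return conditions;
    }

    public static void main(String[] args) {
        // Conjuncts of AND(=($0, $27), =($10, $28), =($34, $2)) as index pairs.
        List<int[]> conjuncts = List.of(
            new int[] {0, 27}, new int[] {10, 28}, new int[] {34, 2});
        // Listed bottom-up: innermost BinaryJoin first.
        split(conjuncts).forEach(System.out::println);
    }
}
```

With these assumed field counts, the sketch reproduces the three BinaryJoin conditions of the second plan, bottom-up. In an actual rule, RelOptUtil.conjunctions could be used to obtain the conjuncts of the original condition instead of the hand-written index pairs.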
Does anyone know of a Calcite rule that already does something like this, or
have a general idea of how such a thing could be implemented? I tried using
the HepPlanner with a rule-based approach, but ran into issues with the
difference in how Wayang and Calcite handle join inputs: Wayang works with the
left and right inputs separately, whereas Calcite uses more of a cross type,
i.e. a row type built from the rows of both the left and the right input.

Thanks