Nirmal Govindaraj created CALCITE-7191:
------------------------------------------
Summary: Hypergraph creation with incorrect hyperedges
Key: CALCITE-7191
URL: https://issues.apache.org/jira/browse/CALCITE-7191
Project: Calcite
Issue Type: Bug
Components: core
Affects Versions: 1.41.0
Reporter: Nirmal Govindaraj
This is about the JOIN_TO_HYPERGRAPH rule introduced in CALCITE-6846
When I tested the following query in RelOptRulesTest.java I encountered an
assertion error in the planAfter stage where HYPER_GRAPH_OPTIMIZE gets executed
{code:java}
HepProgram program = new
HepProgramBuilder().addMatchOrder(HepMatchOrder.BOTTOM_UP)
.addRuleInstance(CoreRules.JOIN_PROJECT_RIGHT_TRANSPOSE)
.addRuleInstance(CoreRules.JOIN_PROJECT_LEFT_TRANSPOSE)
.addRuleInstance(CoreRules.PROJECT_MERGE)
.addRuleInstance(CoreRules.PROJECT_REMOVE)
.addRuleInstance(CoreRules.JOIN_TO_HYPER_GRAPH)
.build();
String innerJoinSql = "select a.ename, bc.name from bonus a inner join (select
b.empno, b.ename, c.name from emp b inner join dept c on b.deptno = c.deptno)
bc on a.ename = bc.ename";
sql(innerJoinSql).withPre(program)
.withRule(CoreRules.HYPER_GRAPH_OPTIMIZE, CoreRules.PROJECT_MERGE)
.check();{code}
{code:java}
java.lang.AssertionError at
org.apache.calcite.rel.rules.HyperGraph$1.visitNodeAndFieldIndex(HyperGraph.java:437)
at
org.apache.calcite.rel.rules.HyperGraph$1.visitNodeAndFieldIndex(HyperGraph.java:432)
at
org.apache.calcite.rex.RexNodeAndFieldIndex.accept(RexNodeAndFieldIndex.java:78)
at org.apache.calcite.rex.RexShuttle.visitList(RexShuttle.java:167) at
org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:119) at
org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:38) at
org.apache.calcite.rex.RexCall.accept(RexCall.java:208) at
org.apache.calcite.rel.rules.HyperGraph.extractJoinCond(HyperGraph.java:443)
{code}
On debugging further I noticed that the edges were not correct in the
hypergraph it created
{code:java}
HyperGraph(edges=[{0}——[INNER, =(node(0)_field(0),
node(1)_field(1))]——{1},{1}——[INNER, =(node(0)_field(7),
node(2)_field(0))]——{2}])
LogicalTableScan(table=[[CATALOG, SALES, BONUS]]) -- nodeIndex 0
LogicalTableScan(table=[[CATALOG, SALES, EMP]]) -- nodeIndex 1
LogicalTableScan(table=[[CATALOG, SALES, DEPT]]) -- nodeIndex 2
{code}
The second edge should be from EMP table so node(1)_field(7), node(2)_field(0).
On debugging the JOIN_TO_HYPERGRAPH rule I found that the issue was in the
adjustNodeBit method of HyperEdge.java. Essentially what happens is that when
we reach the BONUS inner join we have a Hypergraph on EMP and DEPT on the right
side of the tree so the node indices of the hyperedges need to be shifted by 1
{code:java}
public HyperEdge adjustNodeBit(int nodeOffset) {
RexShuttle shiftNodeIndexShuttle = new RexShuttle() {
@Override public RexNode visitNodeAndFieldIndex(RexNodeAndFieldIndex
nodeAndFieldIndex) {
return new RexNodeAndFieldIndex(
nodeAndFieldIndex.getNodeIndex() << nodeOffset,
nodeAndFieldIndex.getFieldIndex(),
nodeAndFieldIndex.getName(),
nodeAndFieldIndex.getType());
}
};
RexNode shiftedCondition = condition.accept(shiftNodeIndexShuttle);
List<ConflictRule> shiftedConflictRules = new ArrayList<>();
for (ConflictRule conflictRule : conflictRules) {
shiftedConflictRules.add(conflictRule.shift(nodeOffset));
}
return new HyperEdge(
tes.shift(nodeOffset),
shiftedConflictRules,
leftNodeUsedInPredicate << nodeOffset,
rightNodeUsedInPredicate << nodeOffset,
initialLeftNodeBits << nodeOffset,
initialRightNodeBits << nodeOffset,
joinType,
shiftedCondition);
} {code}
But in the code it does a leftshift by 1 instead
(nodeAndFieldIndex.getNodeIndex() << nodeOffset in shiftNodeIndexShuttle). So
the hyperedge between EMP and DEPT which was originally between node 0 and node
1 turns into node 0 and node 2
Changing the code to nodeAndFieldIndex.getNodeIndex() + nodeOffset in
shiftNodeIndexShuttle fixes the issue
--
This message was sent by Atlassian Jira
(v8.20.10#820010)