Mihai Budiu created CALCITE-7092:
------------------------------------
Summary: DPhyp implementation assertion failure
Key: CALCITE-7092
URL: https://issues.apache.org/jira/browse/CALCITE-7092
Project: Calcite
Issue Type: Bug
Components: core
Affects Versions: 1.40.0
Reporter: Mihai Budiu
This is about the hypergraph-based join optimization algorithm introduced in
[CALCITE-6846] and used in the optimization rule HYPER_GRAPH_OPTIMIZE.
I have encountered the following error:
{code}
set type is RecordType(DECIMAL(38, 2) EXPR$2, BIGINT NOT NULL EXPR$0, BIGINT NOT NULL EXPR$1) NOT NULL
expression type is RecordType(BIGINT NOT NULL EXPR$1, DECIMAL(38, 2) EXPR$2, BIGINT NOT NULL EXPR$0) NOT NULL
set is rel#212:HyperGraph.(input#0=HepRelVertex#216,input#1=HepRelVertex#206,edges={0}——[INNER, true]——{1})
expression is LogicalJoin(condition=[true], joinType=[inner])
  LogicalProject(EXPR$1=[$0])
    LogicalAggregate(group=[{}], EXPR$1=[COUNT($0)])
      LogicalAggregate(group=[{0}])
        LogicalProject(id=[$1])
          LogicalProject($f0=[CASE(=($1, 1), $0, null:INTEGER)], id=[$0], $f2=[CAST(CASE(AND(=($1, 0), $4), CAST(ROUND($2, 0)):DECIMAL(5, 2), 0.00:DECIMAL(5, 2))):DECIMAL(5, 2)])
            LogicalTableScan(table=[[schema, t]])
  LogicalProject(EXPR$2=[$0], EXPR$0=[$1])
    HyperGraph(edges=[{0}——[INNER, true]——{1}])
      LogicalAggregate(group=[{}], EXPR$2=[SUM($2)])
        LogicalProject($f0=[CASE(=($1, 1), $0, null:INTEGER)], id=[$0], $f2=[CAST(CASE(AND(=($1, 0), $4), CAST(ROUND($2, 0)):DECIMAL(5, 2), 0.00:DECIMAL(5, 2))):DECIMAL(5, 2)])
          LogicalTableScan(table=[[schema, t]])
      LogicalAggregate(group=[{}], EXPR$0=[COUNT($0)])
        LogicalAggregate(group=[{0}])
          LogicalProject($f0=[$0])
            LogicalProject($f0=[CASE(=($1, 1), $0, null:INTEGER)], id=[$0], $f2=[CAST(CASE(AND(=($1, 0), $4), CAST(ROUND($2, 0)):DECIMAL(5, 2), 0.00:DECIMAL(5, 2))):DECIMAL(5, 2)])
              LogicalTableScan(table=[[schema, t]])
Type mismatch:
rowtype of original rel: RecordType(DECIMAL(38, 2) EXPR$2, BIGINT NOT NULL EXPR$0, BIGINT NOT NULL EXPR$1) NOT NULL
rowtype of new rel: RecordType(BIGINT NOT NULL EXPR$1, DECIMAL(38, 2) EXPR$2, BIGINT NOT NULL EXPR$0) NOT NULL
Difference:
EXPR$2: DECIMAL(38, 2) -> BIGINT NOT NULL
EXPR$0: BIGINT NOT NULL -> DECIMAL(38, 2)
{code}
This happens in our compiler after the query has already gone through several
other optimizations, and with a different cost model for DPhyp, so it may be
tricky to post an exact reproduction. A query that can trigger it is:
{code:sql}
CREATE TABLE T(id INT, od INT, val DECIMAL(5, 2), ct INT, e BOOLEAN);

SELECT
  COUNT(DISTINCT CASE WHEN od = 1 THEN id END),
  COUNT(DISTINCT id),
  SUM(CASE WHEN (od = 0 AND e) THEN ROUND(val, 0) ELSE 0.0 END)
FROM T;
{code}
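It looks like the code that permutes fields in the result to reconstruct the
original output order is incorrect: the original row type is
(EXPR$2, EXPR$0, EXPR$1), but the rewritten plan produces
(EXPR$1, EXPR$2, EXPR$0).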
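
To illustrate what a correct restoring projection would have to do for the row types in the error above, here is a minimal standalone sketch. It is not the Calcite implementation: the class FieldOrder and its methods are hypothetical, it uses only the JDK, and it matches fields by name purely for illustration (the real code presumably tracks field indexes, since names need not be unique).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Calcite: computes the projection that
// restores an expected field order from a reordered row.
final class FieldOrder {
    // For each expected field, find the index of the same-named field in the
    // actual (reordered) row. Assumes field names are unique.
    static List<Integer> restoringProjection(List<String> expected, List<String> actual) {
        List<Integer> projection = new ArrayList<>();
        for (String name : expected) {
            int index = actual.indexOf(name);
            if (index < 0) {
                throw new IllegalArgumentException("missing field: " + name);
            }
            projection.add(index);
        }
        return projection;
    }

    // Apply a projection: output position i takes input position projection[i].
    static <T> List<T> project(List<T> row, List<Integer> projection) {
        List<T> out = new ArrayList<>();
        for (int index : projection) {
            out.add(row.get(index));
        }
        return out;
    }

    public static void main(String[] args) {
        // Field names and types copied from the error message above.
        List<String> expected = List.of("EXPR$2", "EXPR$0", "EXPR$1");
        List<String> actual = List.of("EXPR$1", "EXPR$2", "EXPR$0");
        List<String> actualTypes =
            List.of("BIGINT NOT NULL", "DECIMAL(38, 2)", "BIGINT NOT NULL");

        List<Integer> projection = restoringProjection(expected, actual);
        System.out.println(projection);                       // [1, 2, 0]
        System.out.println(project(actualTypes, projection)); // [DECIMAL(38, 2), BIGINT NOT NULL, BIGINT NOT NULL]
    }
}
```

With the projection [1, 2, 0] applied on top of the new join, the types line up with the original row type again, which is what the type check in the error message expects.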