Ruben Q L created CALCITE-6468:
----------------------------------
Summary: RelDecorrelator throws AssertionError if correlated
variable is used as Aggregate group key
Key: CALCITE-6468
URL: https://issues.apache.org/jira/browse/CALCITE-6468
Project: Calcite
Issue Type: Bug
Components: core
Affects Versions: 1.37.0
Reporter: Ruben Q L
Assignee: Ruben Q L
Fix For: 1.38.0
The problem can be reproduced with this query (a "simplified" version of TPC-DS
query1):
{code:sql}
WITH agg_sal AS
(SELECT deptno, sum(sal) AS total FROM emp GROUP BY deptno)
SELECT 1 FROM agg_sal s1
WHERE s1.total > (SELECT avg(total) FROM agg_sal s2 WHERE s1.deptno = s2.deptno)
{code}
If we apply the subquery program, then FilterAggregateTransposeRule, and then
call the RelDecorrelator, it fails with:
{noformat}
java.lang.AssertionError
  at org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:581)
  at org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:495)
  ...
{noformat}
The problem appears in this assert (RelDecorrelator.java:581):
{code}
assert newPos == newInputOutput.size();
{code}
The root cause seems to be that, a few lines before the assert, when the
correlating variables from {{corDefOutputs}} are processed, an entry is
inserted into {{mapNewInputToProjOutputs}}:
{code}
if (!frame.corDefOutputs.isEmpty()) {
  for (Map.Entry<CorDef, Integer> entry : frame.corDefOutputs.entrySet()) {
    RexInputRef.add2(projects, entry.getValue(), newInputOutput);
    corDefOutputs.put(entry.getKey(), newPos);
    mapNewInputToProjOutputs.put(entry.getValue(), newPos); // <-- HERE
    newPos++;
  }
}
{code}
The problem is that this key was already in the map, as it had been inserted
previously as part of the group key processing:
{code}
for (int i = 0; i < oldGroupKeyCount; i++) {
  final int idx = groupKeyIndices.get(i);
  ...
  // add mapping of group keys.
  outputMap.put(idx, newPos);
  int newInputPos = requireNonNull(frame.oldToNewOutputs.get(idx));
  RexInputRef.add2(projects, newInputPos, newInputOutput);
  mapNewInputToProjOutputs.put(newInputPos, newPos); // <-- HERE added first
  newPos++;
}
{code}
Therefore, the unnecessary insertion into {{mapNewInputToProjOutputs}} and the
subsequent increment of {{newPos}} when the {{CorDef}}s are processed lead to
the mismatch.
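To make the arithmetic concrete, here is a minimal stand-alone sketch of the position bookkeeping. It is not Calcite code: the class and method names are hypothetical, fields are modelled as plain integers, and only the three loops quoted in this description are replayed. It assumes the aggregate input has 2 fields and that the single group key and the single {{CorDef}} both point at input field 0.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-alone model of the bookkeeping; not part of Calcite. */
final class NewPosMismatchSketch {
  /**
   * Replays the group key loop, the unguarded CorDef loop and the
   * "remaining fields" loop, and returns the final value of newPos.
   */
  static int finalNewPos(int inputFieldCount, List<Integer> groupKeyInputs,
      List<Integer> corDefInputs) {
    Map<Integer, Integer> mapNewInputToProjOutputs = new HashMap<>();
    int newPos = 0;

    // Group keys: one projected field per key.
    for (int newInputPos : groupKeyInputs) {
      mapNewInputToProjOutputs.put(newInputPos, newPos);
      newPos++;
    }

    // Correlating variables: inserted unconditionally, so a field that is
    // already a group key is projected a second time.
    for (int corDefInput : corDefInputs) {
      mapNewInputToProjOutputs.put(corDefInput, newPos);
      newPos++;
    }

    // Remaining fields: only those not yet mapped.
    for (int i = 0; i < inputFieldCount; i++) {
      if (!mapNewInputToProjOutputs.containsKey(i)) {
        mapNewInputToProjOutputs.put(i, newPos);
        newPos++;
      }
    }
    return newPos;
  }

  public static void main(String[] args) {
    // Group key and CorDef share input field 0: prints 3, although there
    // are only 2 input fields.
    System.out.println(finalNewPos(2, Arrays.asList(0), Arrays.asList(0)));
    // CorDef on a different field: prints 2, matching the field count.
    System.out.println(finalNewPos(2, Arrays.asList(0), Arrays.asList(1)));
  }
}
```

In the first call the final counter is 3 while the input has 2 fields, which is the situation in which {{assert newPos == newInputOutput.size()}} fails.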
Notice how, right before the assertion, when the remaining fields are
processed, the code verifies that the key is not already contained in
{{mapNewInputToProjOutputs}}:
{code}
// add the remaining fields
final int newGroupKeyCount = newPos;
for (int i = 0; i < newInputOutput.size(); i++) {
  if (!mapNewInputToProjOutputs.containsKey(i)) { // <-- HERE checked
    RexInputRef.add2(projects, i, newInputOutput);
    mapNewInputToProjOutputs.put(i, newPos);
    newPos++;
  }
}
{code}
Thus, the solution would probably be to apply the same logic when the
{{CorDef}}s are processed:
{code}
if (!frame.corDefOutputs.isEmpty()) {
  for (Map.Entry<CorDef, Integer> entry : frame.corDefOutputs.entrySet()) {
    Integer pos = mapNewInputToProjOutputs.get(entry.getValue());
    if (pos == null) {
      RexInputRef.add2(projects, entry.getValue(), newInputOutput);
      corDefOutputs.put(entry.getKey(), newPos);
      mapNewInputToProjOutputs.put(entry.getValue(), newPos);
      newPos++;
    } else {
      corDefOutputs.put(entry.getKey(), pos);
    }
  }
}
{code}
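As a rough sanity check of the proposed guard, the following stand-alone sketch replays the same bookkeeping with the lookup in place. It is only a model under stated assumptions, not the actual patch: the class and method names are hypothetical, a {{CorDef}} is stood in for by a plain string, and the input is assumed to have 2 fields with the group key and the {{CorDef}} both on field 0.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-alone model of the guarded CorDef loop; not part of Calcite. */
final class CorDefReuseSketch {
  /**
   * Replays the three loops with the proposed lookup, fills corDefOutputs
   * (string keys stand in for CorDef) and returns the final newPos.
   */
  static int assignPositions(int inputFieldCount, List<Integer> groupKeyInputs,
      Map<String, Integer> frameCorDefOutputs, Map<String, Integer> corDefOutputs) {
    Map<Integer, Integer> mapNewInputToProjOutputs = new HashMap<>();
    int newPos = 0;

    // Group keys: one projected field per key.
    for (int newInputPos : groupKeyInputs) {
      mapNewInputToProjOutputs.put(newInputPos, newPos);
      newPos++;
    }

    // Correlating variables: reuse the existing position if the field was
    // already projected as a group key, otherwise project it now.
    for (Map.Entry<String, Integer> entry : frameCorDefOutputs.entrySet()) {
      Integer pos = mapNewInputToProjOutputs.get(entry.getValue());
      if (pos == null) {
        corDefOutputs.put(entry.getKey(), newPos);
        mapNewInputToProjOutputs.put(entry.getValue(), newPos);
        newPos++;
      } else {
        corDefOutputs.put(entry.getKey(), pos);
      }
    }

    // Remaining fields: only those not yet mapped.
    for (int i = 0; i < inputFieldCount; i++) {
      if (!mapNewInputToProjOutputs.containsKey(i)) {
        mapNewInputToProjOutputs.put(i, newPos);
        newPos++;
      }
    }
    return newPos;
  }

  public static void main(String[] args) {
    Map<String, Integer> frameCorDefOutputs = new LinkedHashMap<>();
    frameCorDefOutputs.put("cor0.deptno", 0); // hypothetical CorDef on field 0
    Map<String, Integer> corDefOutputs = new LinkedHashMap<>();
    int newPos = assignPositions(2, Arrays.asList(0), frameCorDefOutputs, corDefOutputs);
    // Prints: 2 {cor0.deptno=0}
    System.out.println(newPos + " " + corDefOutputs);
  }
}
```

With the guard, the counter ends at 2 for a 2-field input, and the correlating variable is mapped to the position already assigned to the group key instead of a duplicate projection.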
--
This message was sent by Atlassian Jira
(v8.20.10#820010)