[ 
https://issues.apache.org/jira/browse/CALCITE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-6468:
-------------------------------
    Description: 
The problem can be reproduced with this query (a "simplified" version of TPC-DS 
query1):
{code:sql}
WITH agg_sal AS
  (SELECT deptno, sum(sal) AS total FROM emp GROUP BY deptno)
SELECT 1 FROM agg_sal s1
WHERE s1.total > (SELECT avg(total) FROM agg_sal s2 WHERE s1.deptno = s2.deptno)
{code}

If we apply the subquery program and FilterAggregateTransposeRule, and then 
call the RelDecorrelator, it fails with:
{noformat}
java.lang.AssertionError
        at 
org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:581)
        at 
org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:495)
        ...
{noformat}

The problem appears in this assert (RelDecorrelator.java:581):
{code}
assert newPos == newInputOutput.size();
{code}

The root cause seems to be that, a few lines before, when processing the 
correlating variables from {{corDefOutputs}}, a value is inserted into 
{{mapNewInputToProjOutputs}}:
{code}
if (!frame.corDefOutputs.isEmpty()) {
  for (Map.Entry<CorDef, Integer> entry : frame.corDefOutputs.entrySet()) {
    RexInputRef.add2(projects, entry.getValue(), newInputOutput);
    corDefOutputs.put(entry.getKey(), newPos);
    mapNewInputToProjOutputs.put(entry.getValue(), newPos); // <-- HERE
    newPos++;
  }
}
{code}

The problem is that this value was already in the map, as it had been inserted 
previously as part of the group key processing:
{code}
for (int i = 0; i < oldGroupKeyCount; i++) {
  final int idx = groupKeyIndices.get(i);
  ...
  // add mapping of group keys.
  outputMap.put(idx, newPos);
  int newInputPos = requireNonNull(frame.oldToNewOutputs.get(idx));
  RexInputRef.add2(projects, newInputPos, newInputOutput);
  mapNewInputToProjOutputs.put(newInputPos, newPos); // <-- HERE added first
  newPos++;
}
{code}

Therefore, the redundant insertion into {{mapNewInputToProjOutputs}} and the 
subsequent increment of {{newPos}} when the {{CorDef}}s are processed lead to 
the mismatch: {{newPos}} ends up larger than {{newInputOutput.size()}}.
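
The counting problem can be illustrated with a toy model of the three passes 
(group keys, correlation outputs, remaining fields). This is only a sketch, 
not Calcite code: the class {{ProjectPositionsDemo}}, the method 
{{finalPosUnguarded}} and the use of plain strings in place of {{CorDef}} are 
assumptions made for illustration, and the {{RexInputRef.add2}} calls are left 
out because only the positions matter here.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the position bookkeeping; hypothetical, not Calcite code. */
final class ProjectPositionsDemo {
  private ProjectPositionsDemo() {}

  /**
   * Runs the three passes with an unguarded correlation pass and returns the
   * final value of newPos. The assertion in the issue expects this value to
   * equal inputFieldCount.
   */
  static int finalPosUnguarded(int inputFieldCount,
      List<Integer> groupKeyInputs, Map<String, Integer> corDefInputs) {
    Map<Integer, Integer> mapNewInputToProjOutputs = new LinkedHashMap<>();
    int newPos = 0;

    // Pass 1: each group key claims the next output position.
    for (int inputPos : groupKeyInputs) {
      mapNewInputToProjOutputs.put(inputPos, newPos);
      newPos++;
    }

    // Pass 2: correlation outputs, with no check for an existing mapping.
    // If inputPos is already a group key, the field is counted twice.
    for (int inputPos : corDefInputs.values()) {
      mapNewInputToProjOutputs.put(inputPos, newPos);
      newPos++;
    }

    // Pass 3: remaining fields, guarded by containsKey.
    for (int i = 0; i < inputFieldCount; i++) {
      if (!mapNewInputToProjOutputs.containsKey(i)) {
        mapNewInputToProjOutputs.put(i, newPos);
        newPos++;
      }
    }
    return newPos;
  }
}
```

In this model, an input with 2 fields, a group key on field 0 and a 
correlation output that also points at field 0 yields 3, while the assertion 
expects 2. If the correlation output points at field 1 instead, the result is 
2 and the assertion would hold.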

Notice how, right before the assertion, when processing the remaining fields, 
the code checks that the value is not already contained in 
{{mapNewInputToProjOutputs}}:
{code}
// add the remaining fields
final int newGroupKeyCount = newPos;
for (int i = 0; i < newInputOutput.size(); i++) {
  if (!mapNewInputToProjOutputs.containsKey(i)) { // <-- HERE checked
    RexInputRef.add2(projects, i, newInputOutput);
    mapNewInputToProjOutputs.put(i, newPos);
    newPos++;
  }
}
{code}

Thus, the solution would probably be to apply the same logic when the 
{{CorDef}}s are processed:
{code}
if (!frame.corDefOutputs.isEmpty()) {
  for (Map.Entry<CorDef, Integer> entry : frame.corDefOutputs.entrySet()) {
    Integer pos = mapNewInputToProjOutputs.get(entry.getValue()); // <-- HERE verify map
    if (pos == null) {
      RexInputRef.add2(projects, entry.getValue(), newInputOutput);
      corDefOutputs.put(entry.getKey(), newPos);
      mapNewInputToProjOutputs.put(entry.getValue(), newPos);
      newPos++;
    } else {
      corDefOutputs.put(entry.getKey(), pos);
    }
  }
}
{code}
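
The effect of the guard can be checked on a toy model of the same three 
passes. Again, this is a sketch under assumptions rather than the actual 
patch: {{GuardedPositionsDemo}}, {{Result}} and {{run}} are hypothetical names, 
strings stand in for {{CorDef}}, and only the position bookkeeping is 
modelled.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the guarded bookkeeping; hypothetical, not Calcite code. */
final class GuardedPositionsDemo {
  private GuardedPositionsDemo() {}

  /** Final newPos plus the output position recorded for each correlation key. */
  static final class Result {
    final int newPos;
    final Map<String, Integer> corDefOutputs;

    Result(int newPos, Map<String, Integer> corDefOutputs) {
      this.newPos = newPos;
      this.corDefOutputs = corDefOutputs;
    }
  }

  static Result run(int inputFieldCount, List<Integer> groupKeyInputs,
      Map<String, Integer> corDefInputs) {
    Map<Integer, Integer> mapNewInputToProjOutputs = new LinkedHashMap<>();
    Map<String, Integer> corDefOutputs = new LinkedHashMap<>();
    int newPos = 0;

    // Pass 1: group keys.
    for (int inputPos : groupKeyInputs) {
      mapNewInputToProjOutputs.put(inputPos, newPos);
      newPos++;
    }

    // Pass 2: correlation outputs, reusing an existing position if present.
    for (Map.Entry<String, Integer> entry : corDefInputs.entrySet()) {
      Integer pos = mapNewInputToProjOutputs.get(entry.getValue());
      if (pos == null) {
        corDefOutputs.put(entry.getKey(), newPos);
        mapNewInputToProjOutputs.put(entry.getValue(), newPos);
        newPos++;
      } else {
        // Already projected as a group key: point at that position.
        corDefOutputs.put(entry.getKey(), pos);
      }
    }

    // Pass 3: remaining fields.
    for (int i = 0; i < inputFieldCount; i++) {
      if (!mapNewInputToProjOutputs.containsKey(i)) {
        mapNewInputToProjOutputs.put(i, newPos);
        newPos++;
      }
    }
    return new Result(newPos, corDefOutputs);
  }
}
```

With 2 input fields, a group key on field 0 and a correlation output on field 
0, this model ends with {{newPos}} equal to 2, and the correlation key maps to 
output position 0, the position already assigned to the group key.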

> RelDecorrelator throws AssertionError if correlated variable is used as 
> Aggregate group key
> -------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-6468
>                 URL: https://issues.apache.org/jira/browse/CALCITE-6468
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.37.0
>            Reporter: Ruben Q L
>            Assignee: Ruben Q L
>            Priority: Major
>             Fix For: 1.38.0
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
