A Beam SQL user found an issue where we discard their output field names.
It appears to be related to a change that made the VolcanoPlanner treat
rel nodes that differ only by field names as equivalent:
https://github.com/apache/calcite/commit/1e9b4da0573ec73d332d4e65fb7fd30491b4318d#diff-008c6d52bfd93bbe963a23c264bc412c68cac3b4837e3f10b8d5e4858cd4acb8L1136

This creates an interaction between CalcMergeRule and the VolcanoPlanner
when there are two equivalence sets whose Calc nodes differ only by output
column name. The sets are merged and the output column rename is lost. This
occurs with both Calcite's LogicalCalc and Beam's BeamCalc, given a series
of three mergeable Calc nodes.
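
To illustrate the mechanism as I understand it, here is a toy sketch in
plain Java. It is not Calcite code: ToyCalc, register, and sampleNodes are
hypothetical names, and "the first node registered under a digest stays the
representative" is my simplification of what set merging does. It only shows
that when the equivalence key omits field names, a node that differs solely
by output names collapses into the earlier set and its names disappear.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: none of these types are Calcite or Beam classes. */
public class FieldNameLossSketch {

    /** Stand-in for a Calc node: its program text plus its output field names. */
    public static final class ToyCalc {
        public final String program;
        public final List<String> fieldNames;

        public ToyCalc(String program, List<String> fieldNames) {
            this.program = program;
            this.fieldNames = fieldNames;
        }

        /** Key used to decide equivalence; field names are optional. */
        String digest(boolean includeFieldNames) {
            return includeFieldNames ? program + fieldNames : program;
        }
    }

    /**
     * Registers nodes in order and returns the representative of the set
     * that the last node lands in. The first node seen for a digest stays
     * the representative (an assumption made for this sketch).
     */
    public static ToyCalc register(List<ToyCalc> nodes, boolean includeFieldNames) {
        Map<String, ToyCalc> sets = new LinkedHashMap<>();
        ToyCalc representative = null;
        for (ToyCalc node : nodes) {
            representative =
                sets.computeIfAbsent(node.digest(includeFieldNames), k -> node);
        }
        return representative;
    }

    /** Two nodes with the same program that differ only by output names. */
    public static List<ToyCalc> sampleNodes() {
        String program = "filter($0 >= 1), project($0, $1)";
        return Arrays.asList(
            new ToyCalc(program, Arrays.asList("id", "v")),
            new ToyCalc(program, Arrays.asList("fout_int", "fout_string")));
    }

    public static void main(String[] args) {
        // Key without field names: the rename is lost.
        System.out.println(register(sampleNodes(), false).fieldNames); // [id, v]
        // Key with field names: the rename survives.
        System.out.println(register(sampleNodes(), true).fieldNames);  // [fout_int, fout_string]
    }
}
```

The second call corresponds roughly to the idea of putting field names back
into the equivalence check, which keeps the two nodes in separate sets.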

The minimal reproduction I've derived in Beam involves running the planner
with CoreRules.FILTER_TO_CALC, CoreRules.PROJECT_TO_CALC,
CoreRules.CALC_MERGE, and BeamCalcRule.INSTANCE using this test query:
    WITH tempTable (id, v) AS (
      SELECT f_int as id, f_string as v FROM PCOLLECTION)
    SELECT id AS fout_int, v AS fout_string FROM tempTable WHERE id >= 1

It creates a logical plan like this:
    LogicalProject(fout_int=[$0], fout_string=[$1])
      LogicalFilter(condition=[>=($0, 1)])
        LogicalProject(id=[$0], v=[$1])
          BeamIOSourceRel(table=[[beam, PCOLLECTION]])

And the planner creates a physical plan like this (with what started as the
top LogicalProject dropped):
    BeamCalcRel(expr#0..1=[{inputs}], expr#2=[1], expr#3=[>=($t0, $t2)], proj#0..1=[{exprs}], $condition=[$t3])
      BeamIOSourceRel(table=[[beam, PCOLLECTION]])

I was able to work around it by restricting the calc merge rule to our
BeamCalc and adding field names back into the equivalence check there:
https://github.com/apache/beam/pull/25290/files#diff-ead622461b5c25264d0c680fcacde454ff457b8c05dc73164cafd298573f56bcR58

This seems like a bug to me, but we've previously been told that Calcite
doesn't promise to retain field names, so I assume Calcite doesn't
consider it one. Is that right? See
https://issues.apache.org/jira/browse/CALCITE-1584?focusedCommentId=16031351&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16031351
