Jim Hughes created FLINK-39558:
----------------------------------

             Summary: LogicalUnnestRule produces incorrect TableFunctionScan 
rowType, masking nullability and breaking LEFT JOIN UNNEST
                 Key: FLINK-39558
                 URL: https://issues.apache.org/jira/browse/FLINK-39558
             Project: Flink
          Issue Type: Bug
          Components: Table SQL / Planner
    Affects Versions: 2.1.1, 2.0.1
            Reporter: Jim Hughes


h2. Problem

{{LogicalUnnestRule}} converts {{Correlate(Uncollect)}} into 
{{Correlate(LogicalTableFunctionScan)}}. When deriving the TFS rowType, the 
rule round-trips through Flink's logical type system:

  Calcite RelDataType
    -> Flink LogicalType (via FlinkTypeFactory.toLogicalType)
    -> UnnestRowsFunctionBase.getUnnestedType(...)
    -> RowType wrapper
    -> Calcite RelDataType (via createFieldTypeFromLogicalType)

This round-trip drops per-field nullability that Calcite's upstream Correlate 
has derived, and synthesizes field names ({{f0}}, {{EXPR$0}}) instead of using 
Calcite's source-derived names. The most visible consequence is the {{LEFT JOIN 
UNNEST}} crash documented in FLINK-33217, but the round-trip is broader than 
that fix addresses.

h3. Reproducer

For a table with a nullable array of NOT-NULL elements:

{code:sql}
CREATE TABLE nested_not_null (
  business_data ARRAY<STRING NOT NULL>,
  nested ROW<`data` ARRAY<STRING NOT NULL>>,
  nested_array ARRAY<ROW<`data` ARRAY<STRING NOT NULL>> NOT NULL>
);

-- (a) Bare Uncollect under LEFT correlate
SELECT * FROM nested_not_null
  LEFT JOIN UNNEST(nested_not_null.business_data) AS exploded_bd ON TRUE;

-- (b) ON-predicate adds a Filter between Correlate and Uncollect
SELECT * FROM nested_not_null
  LEFT JOIN UNNEST(nested_not_null.business_data) AS exploded_bd
  ON exploded_bd <> 'debug';
{code}

Both queries fail compilation with:

{noformat}
java.lang.AssertionError: Cannot add expression of different type to set:
set type is RecordType(... VARCHAR(2147483647) business_data0) NOT NULL
expression type is RecordType(... VARCHAR(2147483647) NOT NULL f0) NOT NULL
Difference:
  business_data0: VARCHAR(2147483647) -> VARCHAR(2147483647) NOT NULL
{noformat}

h2. Root Cause

The TFS rowType produced by the rule does not match the rowType Calcite's 
Correlate derives for the original {{Correlate(Uncollect)}} tree. Specifically:

* Calcite's {{Uncollect.deriveUncollectRowType}} produces field names derived 
from the source array column (e.g. {{business_data0}}) and propagates container 
nullability.
* Flink's {{UnnestRowsFunctionBase.getUnnestedType}} returns just the element 
type, which is then wrapped into a synthetic {{RowType.of(...)}} with default 
field names ({{f0}}). The container's nullability is dropped.

When {{LogicalUnnestRule}} replaces the {{Uncollect}} with a 
{{LogicalTableFunctionScan}} carrying this synthesised rowType, the new 
{{Correlate}}'s derived rowType diverges from the original. 
{{RelOptUtil.verifyTypeEquivalence}} then fails.

h3. Why the previous fix is incomplete

FLINK-33217 added {{getLogicalProjectWithAdjustedNullability}}, which inserts 
{{CAST}}-to-nullable wrappers on each projection in the LEFT-correlate Project 
branch. This patches the symptom but leaves the underlying round-trip in place, 
so it covers only one of the four shapes the rule matches:

|| Shape || Pre-fix behavior ||
| {{Correlate(LEFT) -> Project(Uncollect)}} | Patched by FLINK-33217 |
| {{Correlate(LEFT) -> Uncollect}} | *Crashes* — no Project to attach CASTs to |
| {{Correlate(LEFT) -> Filter(Uncollect)}} | *Crashes* — same reason |
| {{Correlate(LEFT) -> Filter(Project(Uncollect))}} | Coincidentally fine on 
master, but documented in earlier investigation as a gap |

h2. Fix

Remove the LogicalType round-trip and use Calcite's {{Uncollect.getRowType()}} 
directly as the TFS rowType. The new {{Correlate}}'s rowType then matches the 
original byte-for-byte regardless of the rule's intermediate shape.

This also lets the dead {{getLogicalProjectWithAdjustedNullability}} helper and 
its dependent imports be removed.

h3. Field naming

Calcite's Uncollect derives names from the source array, which means plan 
output for UNNEST changes:

* {{ARRAY<T>}} / {{MULTISET<T>}}: the unnested column is named after the source 
array column (e.g., {{tags0}} instead of synthetic {{f0}} or {{EXPR$0}}).
* {{MAP<K,V>}}: key/value columns are named {{KEY}} and {{VALUE}} instead of 
{{f0}}/{{f1}}.
* {{WITH ORDINALITY}}: the ordinality column is {{ORDINALITY}} (unchanged).

Multiple unnests in the same query are auto-disambiguated by Calcite's outer 
Correlate (e.g. two MAP unnests produce {{KEY, VALUE, KEY0, VALUE0}}).

The {{INTERNAL_UNNEST_ROWS}} runtime function is positional, so persisted 
CompiledPlans continue to restore correctly. Verified by 
{{CorrelateRestoreTest}}.

h2. Plan-fixture impact

Recorded {{verifyRelPlan}} fixtures change uniformly to reflect the new naming:

* {{flink-table/flink-table-planner/src/test/resources/.../UnnestTest.xml}} 
(batch + stream)
* 
{{flink-table/flink-table-planner/src/test/resources/.../LogicalUnnestRuleTest.xml}}
* {{flink-table/flink-table-planner/src/test/resources/.../MultiJoinTest.xml}} 
(one entry)
* 
{{flink-table/flink-table-planner/src/test/resources/.../JavaCatalogTableTest.xml}}
 (one entry)

h2. Test coverage

Two reproducers added to {{UnnestTestBase}} (commit 1, intentionally failing 
without the fix):
* {{testNullMismatchLeftJoinNoAliasList}}: bare {{Uncollect}} under LEFT 
correlate
* {{testNullMismatchLeftJoinOnPredicate}}: {{Filter(Uncollect)}} under LEFT 
correlate

Generated-by: Claude Code




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to