[ https://issues.apache.org/jira/browse/FLINK-33217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17773263#comment-17773263 ]
Sergey Nuyanzin commented on FLINK-33217: ----------------------------------------- Hi [~rmetzger] thanks for raising the issue It looks like the problem consists of several parts 1. there is an error in SQL provided in description it complains about type mismatch (in trace) like {noformat} RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" NOT NULL name) NOT NULL ARRAY updateEvent) business_data, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" ue_name) NOT NULL rowtype of rel after registration: RecordType(RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" id, RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" NOT NULL name) NOT NULL ARRAY updateEvent) business_data, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" NOT NULL name) NOT NULL {noformat} the first discrepancy is naimng: in one column it is {{ue_name}} while in another it is {{name}}, so the more correct SQL should look like {code:sql} select id, name from reproduce_unnest LEFT JOIN UNNEST(reproduce_unnest.business_data.updateEvent) AS exploded_ue(name) ON true; {code} However this will also fail with {noformat} set type is RecordType(RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" id, RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" NOT NULL name) NOT NULL ARRAY updateEvent) business_data, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" name) NOT NULL expression type is RecordType(RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" id, RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" NOT NULL name) NOT NULL ARRAY updateEvent) business_data, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" NOT NULL name) NOT NULL set is rel#104:LogicalCorrelate.NONE.any.None: 0.[NONE].[NONE](left=HepRelVertex#98,right=HepRelVertex#103,correlation=$cor0,joinType=left,requiredColumns={0}) expression is LogicalCorrelate(correlation=[$cor0], joinType=[left], requiredColumns=[{0}]) LogicalTableScan(table=[[default_catalog, default_database, reproduce_unnest, source: [CollectionTableSource(business_data)]]]) LogicalTableFunctionScan(invocation=[$UNNEST_ROWS$1($cor0.business_data.updateEvent)], rowType=[RecordType:peek_no_expand(VARCHAR(2147483647) name)]) {noformat} Now if look at type comparison in trace {noformat} RecordType(RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" id, RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" NOT NULL name) NOT NULL ARRAY updateEvent) business_data, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" name) NOT NULL RecordType(RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" id, RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" NOT NULL name) NOT NULL ARRAY updateEvent) business_data, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" NOT NULL name) NOT NULL {noformat} we can see that the difference is that for the first type column {{name}} is nullable and for the second is not the WA is to ensure that there is the same type on both sides of the join e.g. nullable for both like {code:sql} CREATE TABLE reproduce_unnest ( business_data ROW<`id` STRING, `updateEvent` ARRAY<ROW<`name` STRING> NOT NULL>> ) WITH ( 'connector' = 'COLLECTION', 'is-bounded' = 'false' ) {code} and {code:sql} select id, name from reproduce_unnest LEFT JOIN UNNEST(reproduce_unnest.business_data.updateEvent) AS exploded_ue(name) ON true {code} > Flink SQL: UNNEST fails with on LEFT JOIN with NOT NULL type in array > --------------------------------------------------------------------- > > Key: FLINK-33217 > URL: https://issues.apache.org/jira/browse/FLINK-33217 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner > Affects Versions: 1.15.3, 1.18.0, 1.19.0 > Reporter: Robert Metzger > Priority: Major > Attachments: UnnestNullErrorTest.scala > > > Steps to reproduce: > Take a column of type > {code:java} > business_data ROW<`id` STRING, `updateEvent` ARRAY<ROW<`name` STRING NOT > NULL> NOT NULL>> {code} > Take this query > {code:java} > select id, ue_name from reproduce_unnest LEFT JOIN > UNNEST(reproduce_unnest.business_data.updateEvent) AS exploded_ue(ue_name) ON > true {code} > And get this error > {code:java} > Caused by: java.lang.AssertionError: Type mismatch: > rowtype of rel before registration: > RecordType(RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET > "UTF-16LE" id, RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET > "UTF-16LE" NOT NULL name) NOT NULL ARRAY updateEvent) business_data, > VARCHAR(2147483647) CHARACTER SET "UTF-16LE" ue_name) NOT NULL > rowtype of rel after registration: > RecordType(RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET > "UTF-16LE" id, RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET > "UTF-16LE" NOT NULL name) NOT NULL ARRAY updateEvent) business_data, > VARCHAR(2147483647) CHARACTER SET "UTF-16LE" NOT NULL name) NOT NULL > Difference: > ue_name: VARCHAR(2147483647) CHARACTER SET "UTF-16LE" -> VARCHAR(2147483647) > CHARACTER SET "UTF-16LE" NOT NULL > at org.apache.calcite.util.Litmus$1.fail(Litmus.java:32) > at org.apache.calcite.plan.RelOptUtil.equal(RelOptUtil.java:2206) > at > org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:275) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1270) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:598) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:613) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.changeTraits(VolcanoPlanner.java:498) > at > org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:315) > at > org.apache.flink.table.planner.plan.optimize.program.FlinkVolcanoProgram.optimize(FlinkVolcanoProgram.scala:62) > ... 66 more > {code} > I have implemented a small test case, which fails against Flink 1.15, 1.8 and > the latest master branch. -- This message was sent by Atlassian Jira (v8.20.10#820010)