[ 
https://issues.apache.org/jira/browse/FLINK-33217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17773263#comment-17773263
 ] 

Sergey Nuyanzin commented on FLINK-33217:
-----------------------------------------

Hi [~rmetzger]
thanks for raising the issue

It looks like the problem consists of several parts
1. there is an error in SQL provided in description
   it complains about type mismatch (in trace) like 
{noformat}
RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" NOT NULL 
name) NOT NULL ARRAY updateEvent) business_data, VARCHAR(2147483647) CHARACTER 
SET "UTF-16LE" ue_name) NOT NULL
rowtype of rel after registration: 
RecordType(RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET 
"UTF-16LE" id, RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET 
"UTF-16LE" NOT NULL name) NOT NULL ARRAY updateEvent) business_data, 
VARCHAR(2147483647) CHARACTER SET "UTF-16LE" NOT NULL name) NOT NULL
{noformat}

the first discrepancy is naimng: in one column it is {{ue_name}} while in 
another it is {{name}}, so the more correct SQL should look like 
{code:sql}
select id, name from reproduce_unnest LEFT JOIN 
UNNEST(reproduce_unnest.business_data.updateEvent) AS exploded_ue(name) ON true;
{code}

However this will also fail with 
{noformat}
set type is RecordType(RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER 
SET "UTF-16LE" id, RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET 
"UTF-16LE" NOT NULL name) NOT NULL ARRAY updateEvent) business_data, 
VARCHAR(2147483647) CHARACTER SET "UTF-16LE" name) NOT NULL
expression type is RecordType(RecordType:peek_no_expand(VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE" id, RecordType:peek_no_expand(VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE" NOT NULL name) NOT NULL ARRAY updateEvent) 
business_data, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" NOT NULL name) NOT 
NULL
set is rel#104:LogicalCorrelate.NONE.any.None: 
0.[NONE].[NONE](left=HepRelVertex#98,right=HepRelVertex#103,correlation=$cor0,joinType=left,requiredColumns={0})
expression is LogicalCorrelate(correlation=[$cor0], joinType=[left], 
requiredColumns=[{0}])
  LogicalTableScan(table=[[default_catalog, default_database, reproduce_unnest, 
source: [CollectionTableSource(business_data)]]])
  
LogicalTableFunctionScan(invocation=[$UNNEST_ROWS$1($cor0.business_data.updateEvent)],
 rowType=[RecordType:peek_no_expand(VARCHAR(2147483647) name)])

{noformat}

Now if look at type comparison in trace
{noformat}
RecordType(RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET 
"UTF-16LE" id, RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET 
"UTF-16LE" NOT NULL name) NOT NULL ARRAY updateEvent) business_data, 
VARCHAR(2147483647) CHARACTER SET "UTF-16LE" name) NOT NULL
RecordType(RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET 
"UTF-16LE" id, RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET 
"UTF-16LE" NOT NULL name) NOT NULL ARRAY updateEvent) business_data, 
VARCHAR(2147483647) CHARACTER SET "UTF-16LE" NOT NULL name) NOT NULL
{noformat}
we can see that the difference is that for the first type column {{name}} is 
nullable and for the second is not

the WA is to ensure that there is the same type on both sides of the join e.g. 
nullable for both
like
{code:sql}
CREATE TABLE reproduce_unnest (
 business_data ROW<`id` STRING, `updateEvent` ARRAY<ROW<`name` STRING> NOT 
NULL>>
) WITH (
 'connector' = 'COLLECTION',
 'is-bounded' = 'false'
)
{code}
and
{code:sql}
select id, name from reproduce_unnest LEFT JOIN 
UNNEST(reproduce_unnest.business_data.updateEvent) AS exploded_ue(name) ON true
{code}

> Flink SQL: UNNEST fails with on LEFT JOIN with NOT NULL type in array
> ---------------------------------------------------------------------
>
>                 Key: FLINK-33217
>                 URL: https://issues.apache.org/jira/browse/FLINK-33217
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / Planner
>    Affects Versions: 1.15.3, 1.18.0, 1.19.0
>            Reporter: Robert Metzger
>            Priority: Major
>         Attachments: UnnestNullErrorTest.scala
>
>
> Steps to reproduce:
> Take a column of type 
> {code:java}
> business_data ROW<`id` STRING, `updateEvent` ARRAY<ROW<`name` STRING NOT 
> NULL> NOT NULL>> {code}
> Take this query
> {code:java}
> select id, ue_name from reproduce_unnest LEFT JOIN 
> UNNEST(reproduce_unnest.business_data.updateEvent) AS exploded_ue(ue_name) ON 
> true {code}
> And get this error
> {code:java}
> Caused by: java.lang.AssertionError: Type mismatch:
> rowtype of rel before registration: 
> RecordType(RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET 
> "UTF-16LE" id, RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET 
> "UTF-16LE" NOT NULL name) NOT NULL ARRAY updateEvent) business_data, 
> VARCHAR(2147483647) CHARACTER SET "UTF-16LE" ue_name) NOT NULL
> rowtype of rel after registration: 
> RecordType(RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET 
> "UTF-16LE" id, RecordType:peek_no_expand(VARCHAR(2147483647) CHARACTER SET 
> "UTF-16LE" NOT NULL name) NOT NULL ARRAY updateEvent) business_data, 
> VARCHAR(2147483647) CHARACTER SET "UTF-16LE" NOT NULL name) NOT NULL
> Difference:
> ue_name: VARCHAR(2147483647) CHARACTER SET "UTF-16LE" -> VARCHAR(2147483647) 
> CHARACTER SET "UTF-16LE" NOT NULL
>       at org.apache.calcite.util.Litmus$1.fail(Litmus.java:32)
>       at org.apache.calcite.plan.RelOptUtil.equal(RelOptUtil.java:2206)
>       at 
> org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:275)
>       at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1270)
>       at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:598)
>       at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:613)
>       at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.changeTraits(VolcanoPlanner.java:498)
>       at 
> org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:315)
>       at 
> org.apache.flink.table.planner.plan.optimize.program.FlinkVolcanoProgram.optimize(FlinkVolcanoProgram.scala:62)
>       ... 66 more
> {code}
> I have implemented a small test case, which fails against Flink 1.15, 1.8 and 
> the latest master branch.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to