[jira] [Commented] (FLINK-32296) Flink SQL handle array of row incorrectly
[ https://issues.apache.org/jira/browse/FLINK-32296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17779441#comment-17779441 ]

Jane Chan commented on FLINK-32296:
-----------------------------------

Encountered the same problem, and the fix worked! Thanks [~Sergey Nuyanzin] (y)

> Flink SQL handle array of row incorrectly
> -----------------------------------------
>
>                 Key: FLINK-32296
>                 URL: https://issues.apache.org/jira/browse/FLINK-32296
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / API
>    Affects Versions: 1.15.3, 1.16.2, 1.17.1
>            Reporter: Lim Qing Wei
>            Assignee: Sergey Nuyanzin
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.18.0, 1.16.3, 1.17.2, 1.19.0
>
>
> Flink SQL produces incorrect results when the data involves an ARRAY of ROW type; here's a reproduction:
>
> {code:java}
> CREATE TEMPORARY VIEW bug_data as (
> SELECT CAST(ARRAY[
> (10, '2020-01-10'), (101, '244ddf'), (1011, '2asdfaf'), (1110, '200'), (2210, '20-01-10'), (4410, '2')
> ] AS ARRAY>)
> UNION
> SELECT CAST(ARRAY[
> (10, '2020-01-10'), (121, '244ddf'), (, '2asdfaf'), (32243, '200'), (2210, '3-01-10'), (4410, '23243243')
> ] AS ARRAY>)
> UNION SELECT CAST(ARRAY[
> (10, '2020-01-10'), (222, '244ddf'), (1011, '2asdfaf'), (1110, '200'), (24367, '20-01-10'), (4410, '2')
> ] AS ARRAY>)
> UNION SELECT CAST(ARRAY[
> (10, '2020-01-10'), (5666, '244ddf'), (435243, '2asdfaf'), (56567, '200'), (2210, '20-01-10'), (4410, '2')
> ] AS ARRAY>)
> UNION SELECT CAST(ARRAY[
> (10, '2020-01-10'), (43543, '244ddf'), (1011, '2asdfaf'), (1110, '200'), (8967564, '20-01-10'), (4410, '2')
> ] AS ARRAY>)
> );
> CREATE TABLE sink (
> r ARRAY>
> ) WITH ('connector' = 'print'); {code}
>
> In all the 1.15, 1.16 and 1.17 versions I've tested, it produces the following:
>
> {noformat}
> [+I[4410, 2], +I[4410, 2], +I[4410, 2], +I[4410, 2], +I[4410, 2], +I[4410, 2]]
> [+I[4410, 23243243], +I[4410, 23243243], +I[4410, 23243243], +I[4410, 23243243], +I[4410, 23243243], +I[4410, 23243243]]{noformat}
>
> I think this is unexpected/wrong because:
> # The query should produce 5 rows, not 2
> # The data is also wrong: every row in each array is the same, but the input rows are not.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Commented] (FLINK-32296) Flink SQL handle array of row incorrectly
[ https://issues.apache.org/jira/browse/FLINK-32296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758396#comment-17758396 ]

Sergey Nuyanzin commented on FLINK-32296:
-----------------------------------------

1.18: [8715f0a1e8c3bb6595d26c69ef4ef246486d187d|https://github.com/apache/flink/commit/8715f0a1e8c3bb6595d26c69ef4ef246486d187d]
[jira] [Commented] (FLINK-32296) Flink SQL handle array of row incorrectly
[ https://issues.apache.org/jira/browse/FLINK-32296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758390#comment-17758390 ]

Sergey Nuyanzin commented on FLINK-32296:
-----------------------------------------

[~renqs] Agreed, I think so too. In fact, the PR [1] for the 1.18 branch was already open; however, yesterday it was still being processed by CI. I will merge it today.

[1] https://github.com/apache/flink/pull/23273
[jira] [Commented] (FLINK-32296) Flink SQL handle array of row incorrectly
[ https://issues.apache.org/jira/browse/FLINK-32296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758323#comment-17758323 ]

Yunhong Zheng commented on FLINK-32296:
---------------------------------------

Hi [~Sergey Nuyanzin]. Sorry for not checking at the beginning whether the same issue already existed in Jira; initially I thought it was a bug in the Kafka connector. This is indeed a duplicate issue, and it can be closed. Thanks for your contribution!
[jira] [Commented] (FLINK-32296) Flink SQL handle array of row incorrectly
[ https://issues.apache.org/jira/browse/FLINK-32296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758306#comment-17758306 ]

Qingsheng Ren commented on FLINK-32296:
---------------------------------------

[~Sergey Nuyanzin] I think we also need to cherry-pick the patch to the release-1.18 branch
[jira] [Commented] (FLINK-32296) Flink SQL handle array of row incorrectly
[ https://issues.apache.org/jira/browse/FLINK-32296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758194#comment-17758194 ]

Sergey Nuyanzin commented on FLINK-32296:
-----------------------------------------

Merged to master as [6d62f9918ea2cbb8a10c705a25a4ff6deab60711|https://github.com/apache/flink/commit/6d62f9918ea2cbb8a10c705a25a4ff6deab60711]
[jira] [Commented] (FLINK-32296) Flink SQL handle array of row incorrectly
[ https://issues.apache.org/jira/browse/FLINK-32296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752067#comment-17752067 ]

Christian Lorenz commented on FLINK-32296:
------------------------------------------

Hi [~qingwei91] and [~Sergey Nuyanzin], we also seem to hit this issue using Flink 1.17.1. I was able to reproduce it and also fix it with the PR changes proposed by [~Sergey Nuyanzin]. Is there a chance that this fix will be added to 1.17.2?
[jira] [Commented] (FLINK-32296) Flink SQL handle array of row incorrectly
[ https://issues.apache.org/jira/browse/FLINK-32296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17743720#comment-17743720 ]

Lim Qing Wei commented on FLINK-32296:
--------------------------------------

Hi [~Sergey Nuyanzin], sorry for the long response time. I finally got around to testing it, and I can confirm it fixes the issue in my test case.
[jira] [Commented] (FLINK-32296) Flink SQL handle array of row incorrectly
[ https://issues.apache.org/jira/browse/FLINK-32296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17741256#comment-17741256 ]

Sergey Nuyanzin commented on FLINK-32296:
-----------------------------------------

[~qingwei91] this error means that you have some other files without a license header. Are you sure you don't have other changes which are not part of this PR?
[jira] [Commented] (FLINK-32296) Flink SQL handle array of row incorrectly
[ https://issues.apache.org/jira/browse/FLINK-32296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17741252#comment-17741252 ]

Lim Qing Wei commented on FLINK-32296:
--------------------------------------

Hi [~Sergey Nuyanzin], I am not able to build your branch locally. Is there an alternative way to test it?
[jira] [Commented] (FLINK-32296) Flink SQL handle array of row incorrectly
[ https://issues.apache.org/jira/browse/FLINK-32296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739253#comment-17739253 ]

Sergey Nuyanzin commented on FLINK-32296:
-----------------------------------------

[~qingwei91] could you please double check that this fix fixes the problem?
[jira] [Commented] (FLINK-32296) Flink SQL handle array of row incorrectly
[ https://issues.apache.org/jira/browse/FLINK-32296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739252#comment-17739252 ]

Sergey Nuyanzin commented on FLINK-32296:
-----------------------------------------

The root cause is {{RowToRowCastRule}}. Since it was introduced in 1.15.0 by FLINK-25052, the query may still work on 1.14.x. During code generation it produces something like
{code:java}
...
for (int i$13 = 0; i$13 < array$7.size(); i$13++) {
...
writer$17.reset();
...
result$15 = row$16;
...
objArray$12[i$13] = result$15;
...
}
...
{code}
where {{result$15}} is an array item. When the item is a row, it is stored by reference and then overwritten with other values in the following iterations, so in the end every element of the array refers to the last element of the source array.

Looking at the example in the description, and in particular at the last element of every array in {{bug_data}}, there are only two distinct values. That explains why the query currently returns 2 rows instead of 5.

The same problem occurs for maps with more than one entry where the key or the value is a row.
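The aliasing described in the root-cause comment above can be reproduced outside Flink. The following is only a minimal sketch in plain Java, not Flink's generated code: {{MutableRow}}, {{castWithReuse}}, {{castWithCopy}} and {{distinctRows}} are hypothetical names made up for illustration, and copying the row is just one possible remedy, not necessarily what the merged fix does.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RowReuseSketch {

    /** Hypothetical stand-in for a reusable, mutable row buffer. */
    static final class MutableRow {
        int id;
        String value;

        MutableRow copy() {
            MutableRow c = new MutableRow();
            c.id = id;
            c.value = value;
            return c;
        }

        @Override
        public String toString() {
            return "+I[" + id + ", " + value + "]";
        }
    }

    /** Buggy pattern: one buffer is rewritten per element and stored by reference. */
    static List<MutableRow> castWithReuse(int[] ids, String[] values) {
        MutableRow row = new MutableRow();      // shared across all iterations
        List<MutableRow> result = new ArrayList<>();
        for (int i = 0; i < ids.length; i++) {
            row.id = ids[i];                    // overwrites what earlier slots point to
            row.value = values[i];
            result.add(row);                    // every slot aliases the same object
        }
        return result;
    }

    /** One possible remedy (an assumption): store a copy of the buffer in each slot. */
    static List<MutableRow> castWithCopy(int[] ids, String[] values) {
        MutableRow row = new MutableRow();
        List<MutableRow> result = new ArrayList<>();
        for (int i = 0; i < ids.length; i++) {
            row.id = ids[i];
            row.value = values[i];
            result.add(row.copy());             // later writes no longer affect this slot
        }
        return result;
    }

    /** Counts distinct arrays after the buggy cast, mimicking UNION's duplicate removal. */
    static int distinctRows(int[][] ids, String[][] values) {
        Set<String> distinct = new HashSet<>();
        for (int r = 0; r < ids.length; r++) {
            distinct.add(castWithReuse(ids[r], values[r]).toString());
        }
        return distinct.size();
    }

    public static void main(String[] args) {
        int[] ids = {10, 101, 4410};
        String[] values = {"2020-01-10", "244ddf", "2"};
        System.out.println(castWithReuse(ids, values)); // [+I[4410, 2], +I[4410, 2], +I[4410, 2]]
        System.out.println(castWithCopy(ids, values));  // [+I[10, 2020-01-10], +I[101, 244ddf], +I[4410, 2]]
    }
}
```

With the reused buffer, each array collapses to copies of its last element, so arrays that differ only in earlier elements become identical and a duplicate-removing UNION keeps fewer rows than were supplied.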
[jira] [Commented] (FLINK-32296) Flink SQL handle array of row incorrectly
[ https://issues.apache.org/jira/browse/FLINK-32296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17731149#comment-17731149 ]

Benchao Li commented on FLINK-32296:
------------------------------------

[~qingwei91] Thanks for reporting the issue. Have you tried it on the current master branch?