[
https://issues.apache.org/jira/browse/FLINK-14900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Leonard Xu updated FLINK-14900:
-------------------------------
Description:
the result of TPC-DS query 19 is incorrect, I analyze the simplify the sql as
following:
{code:java}
// simplify query 19
select top 100 ca_zip,s_zip,substring(ca_zip,1,5), substring(s_zip,1,5)
from customer_address,store
where substring(ca_zip,1,5) <> substring(s_zip,1,5)
order by ca_zip,s_zip{code}
flink output:
{code:java}
null|31904|null|31904
null|31904|null|31904
null|31904|null|31904
null|31904|null|31904{code}
SQL server output:
{code:java}
00601 31904 00601 31904
00601 31904 00601 31904
00601 31904 00601 31904
00601 31904 00601 31904
{code}
This issue may influent the result of tpcds query 46、68.
Another similar pattern "where a = b" in query31, it'll will bring null value
too.
{code:java}
// simplify query 31
select
ss1.ca_county
,ss1.d_year
,ss2.store_sales/ss1.store_sales store_q1_q2_increase
from
ss ss1
,ss ss2
where
ss1.d_qoy = 1
and ss1.d_year = 2000
and ss1.ca_county = ss2.ca_county
and ss2.d_qoy = 2
and ss2.d_year = 2000
order by ss1.ca_county{code}
Flink output :
{code:java}
|2000|0.845635
Acadia Parish|2000|1.362160
Accomack County|2000|0.650251
{code}
SQL server output:
{code:java}
Acadia Parish|2000|1.362159
Accomack County|2000|0.650250
{code}
I tested expression test and sql ITcase,they will return correct value.
{code:java}
@Test
def testSubStr(): Unit = {
checkQuery(
Seq[(String, String, Double)]((null, "00601" , 100d), ("00501", "00601",
50d)),
"select f0, substring(f0,1,5) from Table1 where substring(f0,1,5) <> f1 ",
Seq(("00501", "00501"))
)
}
{code}
So, I need to dig more about this issue.
was:
the result of TPC-DS query 19 is incorrect, I analyze the simplify the sql as
following:
{code:java}
// simplify query
select top 100 ca_zip,s_zip,substring(ca_zip,1,5), substring(s_zip,1,5)
from customer_address,store
where substring(ca_zip,1,5) <> substring(s_zip,1,5)
order by ca_zip,s_zip{code}
flink output:
{code:java}
null|31904|null|31904
null|31904|null|31904
null|31904|null|31904
null|31904|null|31904{code}
SQL server output:
{code:java}
00601 31904 00601 31904
00601 31904 00601 31904
00601 31904 00601 31904
00601 31904 00601 31904
{code}
And this issue may influent the result of tpcds query 46、68.
I tested expression test and sql ITcase,they will return correct value.
{code:java}
@Test
def testSubStr(): Unit = {
checkQuery(
Seq[(String, String, Double)]((null, "00601" , 100d), ("00501", "00601",
50d)),
"select f0, substring(f0,1,5) from Table1 where substring(f0,1,5) <> f1 ",
Seq(("00501", "00501"))
)
}
{code}
So, I need to dig more about this issue.
> output contains null value when a is null in pattern where a <> 'String'
> ---------------------------------------------------------------------------
>
> Key: FLINK-14900
> URL: https://issues.apache.org/jira/browse/FLINK-14900
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Planner
> Affects Versions: 1.9.1
> Reporter: Leonard Xu
> Priority: Major
> Fix For: 1.10.0
>
>
> the result of TPC-DS query 19 is incorrect, I analyze the simplify the sql as
> following:
> {code:java}
> // simplify query 19
> select top 100 ca_zip,s_zip,substring(ca_zip,1,5), substring(s_zip,1,5)
> from customer_address,store
> where substring(ca_zip,1,5) <> substring(s_zip,1,5)
> order by ca_zip,s_zip{code}
>
> flink output:
> {code:java}
> null|31904|null|31904
> null|31904|null|31904
> null|31904|null|31904
> null|31904|null|31904{code}
> SQL server output:
> {code:java}
> 00601 31904 00601 31904
> 00601 31904 00601 31904
> 00601 31904 00601 31904
> 00601 31904 00601 31904
> {code}
> This issue may influent the result of tpcds query 46、68.
> Another similar pattern "where a = b" in query31, it'll will bring null
> value too.
>
> {code:java}
> // simplify query 31
> select
> ss1.ca_county
> ,ss1.d_year
> ,ss2.store_sales/ss1.store_sales store_q1_q2_increase
> from
> ss ss1
> ,ss ss2
> where
> ss1.d_qoy = 1
> and ss1.d_year = 2000
> and ss1.ca_county = ss2.ca_county
> and ss2.d_qoy = 2
> and ss2.d_year = 2000
> order by ss1.ca_county{code}
>
> Flink output :
>
> {code:java}
> |2000|0.845635
> Acadia Parish|2000|1.362160
> Accomack County|2000|0.650251
> {code}
>
> SQL server output:
>
> {code:java}
> Acadia Parish|2000|1.362159
> Accomack County|2000|0.650250
> {code}
>
>
> I tested expression test and sql ITcase,they will return correct value.
> {code:java}
> @Test
> def testSubStr(): Unit = {
> checkQuery(
> Seq[(String, String, Double)]((null, "00601" , 100d), ("00501", "00601",
> 50d)),
> "select f0, substring(f0,1,5) from Table1 where substring(f0,1,5) <> f1
> ",
> Seq(("00501", "00501"))
> )
> }
> {code}
> So, I need to dig more about this issue.
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)