[
https://issues.apache.org/jira/browse/FLINK-38468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jie Cheng updated FLINK-38468:
------------------------------
Description:
I have two flink tasks, using 1.12 and 1.18 respectively. flinksql contains the
COALESCE function. I found that the performance of 1.18 is not as good as that
of 1.12
Use the sql:
select COALESCE(level_type, GET_JSON_OBJECT(order_n_tuple, '$.level_type'),
'0') as level_type
1.12 Execution Plan
== Abstract Syntax Tree ==
LogicalProject(level_type=[CASE(IS NOT NULL($1), CAST($1):VARCHAR(2147483647)
CHARACTER SET "UTF-16LE" NOT NULL, IS NOT NULL(GET_JSON_OBJECT($0,
_UTF-16LE'$.level_type')), CAST(GET_JSON_OBJECT($0,
_UTF-16LE'$.level_type')):VARCHAR(2147483647) CHARACTER SET "UTF-16LE" NOT
NULL, _UTF-16LE'0':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")])
+- LogicalTableScan(table=[[default_catalog, default_database, shadow_source]])
== Optimized Logical Plan ==
Calc(select=[CASE(IS NOT NULL(level_type), CAST(level_type), IS NOT
NULL(GET_JSON_OBJECT(order_n_tuple, _UTF-16LE'$.level_type')),
CAST(GET_JSON_OBJECT(order_n_tuple, _UTF-16LE'$.level_type')),
_UTF-16LE'0':VARCHAR(2147483647) CHARACTER SET "UTF-16LE") AS level_type])
+- TableSourceScan(table=[[default_catalog, default_database, shadow_source]],
fields=[order_n_tuple, level_type])
== Physical Execution Plan ==
Stage 1 : Data Source
content : Source: TableSourceScan(table=[[default_catalog, default_database,
shadow_source]], fields=[order_n_tuple, level_type])
Stage 2 : Operator
content : Calc(select=[(level_type IS NOT NULL CASE CAST(level_type) CASE
(order_n_tuple GET_JSON_OBJECT _UTF-16LE'$.level_type') IS NOT NULL CASE
CAST((order_n_tuple GET_JSON_OBJECT _UTF-16LE'$.level_type')) CASE
_UTF-16LE'0':VARCHAR(2147483647) CHARACTER SET "UTF-16LE") AS level_type])
ship_strategy : FORWARD
1.18 Execution Plan
== Abstract Syntax Tree ==
LogicalProject(level_type=[COALESCE($1, GET_JSON_OBJECT($0,
_UTF-16LE'$.level_type'), _UTF-16LE'0')])
+- LogicalTableScan(table=[[default_catalog, default_database, shadow_source]])
== Optimized Physical Plan ==
Calc(select=[COALESCE(level_type, GET_JSON_OBJECT(order_n_tuple,
_UTF-16LE'$.level_type'), _UTF-16LE'0') AS level_type])
+- TableSourceScan(table=[[default_catalog, default_database, shadow_source]],
fields=[order_n_tuple, level_type])
== Optimized Execution Plan ==
Calc(select=[COALESCE(level_type, GET_JSON_OBJECT(order_n_tuple,
'$.level_type'), '0') AS level_type])
+- TableSourceScan(table=[[default_catalog, default_database, shadow_source]],
fields=[order_n_tuple, level_type])
was:
I have two flink tasks, using 1.12 and 1.18 respectively. flinksql contains the
COALESCE function. I found that the performance of 1.18 is not as good as that
of 1.12
Use the sql:
select COALESCE(level_type, GET_JSON_OBJECT(order_n_tuple, '$.level_type'),
'0') as level_type
1.12 Execution Plan
== Abstract Syntax Tree ==
LogicalProject(level_type=[CASE(IS NOT NULL($1), CAST($1):VARCHAR(2147483647)
CHARACTER SET "UTF-16LE" NOT NULL, IS NOT NULL(GET_JSON_OBJECT($0,
_UTF-16LE'$.level_type')), CAST(GET_JSON_OBJECT($0,
_UTF-16LE'$.level_type')):VARCHAR(2147483647) CHARACTER SET "UTF-16LE" NOT
NULL, _UTF-16LE'0':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")])
+- LogicalTableScan(table=[[default_catalog, default_database, shadow_source]])
== Optimized Logical Plan ==
Calc(select=[CASE(IS NOT NULL(level_type), CAST(level_type), IS NOT
NULL(GET_JSON_OBJECT(order_n_tuple, _UTF-16LE'$.level_type')),
CAST(GET_JSON_OBJECT(order_n_tuple, _UTF-16LE'$.level_type')),
_UTF-16LE'0':VARCHAR(2147483647) CHARACTER SET "UTF-16LE") AS level_type])
+- TableSourceScan(table=[[default_catalog, default_database, shadow_source]],
fields=[order_n_tuple, level_type])
== Physical Execution Plan ==
Stage 1 : Data Source
content : Source: TableSourceScan(table=[[default_catalog, default_database,
shadow_source]], fields=[order_n_tuple, level_type])
Stage 2 : Operator
content : Calc(select=[(level_type IS NOT NULL CASE CAST(level_type) CASE
(order_n_tuple GET_JSON_OBJECT _UTF-16LE'$.level_type') IS NOT NULL CASE
CAST((order_n_tuple GET_JSON_OBJECT _UTF-16LE'$.level_type')) CASE
_UTF-16LE'0':VARCHAR(2147483647) CHARACTER SET "UTF-16LE") AS level_type])
ship_strategy : FORWARD
1.18 Execution Plan
== Abstract Syntax Tree ==
LogicalProject(level_type=[COALESCE($1, GET_JSON_OBJECT($0,
_UTF-16LE'$.level_type'), _UTF-16LE'0')])
+- LogicalTableScan(table=[[default_catalog, default_database, shadow_source]])
== Optimized Physical Plan ==
Calc(select=[COALESCE(level_type, GET_JSON_OBJECT(order_n_tuple,
_UTF-16LE'$.level_type'), _UTF-16LE'0') AS level_type])
+- TableSourceScan(table=[[default_catalog, default_database, shadow_source]],
fields=[order_n_tuple, level_type])
== Optimized Execution Plan ==
Calc(select=[COALESCE(level_type, GET_JSON_OBJECT(order_n_tuple,
'$.level_type'), '0') AS level_type])
+- TableSourceScan(table=[[default_catalog, default_database, shadow_source]],
fields=[order_n_tuple, level_type])
> Performance drops when the coalesce parameter includes functions
> ----------------------------------------------------------------
>
> Key: FLINK-38468
> URL: https://issues.apache.org/jira/browse/FLINK-38468
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Planner
> Affects Versions: 1.18.0
> Reporter: Jie Cheng
> Priority: Major
> Fix For: 2.2.0
>
>
> I have two flink tasks, using 1.12 and 1.18 respectively. flinksql contains
> the COALESCE function. I found that the performance of 1.18 is not as good as
> that of 1.12
>
> Use the sql:
> select COALESCE(level_type, GET_JSON_OBJECT(order_n_tuple, '$.level_type'),
> '0') as level_type
>
> 1.12 Execution Plan
> == Abstract Syntax Tree ==
> LogicalProject(level_type=[CASE(IS NOT NULL($1), CAST($1):VARCHAR(2147483647)
> CHARACTER SET "UTF-16LE" NOT NULL, IS NOT NULL(GET_JSON_OBJECT($0,
> _UTF-16LE'$.level_type')), CAST(GET_JSON_OBJECT($0,
> _UTF-16LE'$.level_type')):VARCHAR(2147483647) CHARACTER SET "UTF-16LE" NOT
> NULL, _UTF-16LE'0':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")])
> +- LogicalTableScan(table=[[default_catalog, default_database,
> shadow_source]])
>
> == Optimized Logical Plan ==
> Calc(select=[CASE(IS NOT NULL(level_type), CAST(level_type), IS NOT
> NULL(GET_JSON_OBJECT(order_n_tuple, _UTF-16LE'$.level_type')),
> CAST(GET_JSON_OBJECT(order_n_tuple, _UTF-16LE'$.level_type')),
> _UTF-16LE'0':VARCHAR(2147483647) CHARACTER SET "UTF-16LE") AS level_type])
> +- TableSourceScan(table=[[default_catalog, default_database,
> shadow_source]], fields=[order_n_tuple, level_type])
>
> == Physical Execution Plan ==
> Stage 1 : Data Source
> content : Source: TableSourceScan(table=[[default_catalog, default_database,
> shadow_source]], fields=[order_n_tuple, level_type])
>
> Stage 2 : Operator
> content : Calc(select=[(level_type IS NOT NULL CASE CAST(level_type) CASE
> (order_n_tuple GET_JSON_OBJECT _UTF-16LE'$.level_type') IS NOT NULL CASE
> CAST((order_n_tuple GET_JSON_OBJECT _UTF-16LE'$.level_type')) CASE
> _UTF-16LE'0':VARCHAR(2147483647) CHARACTER SET "UTF-16LE") AS level_type])
> ship_strategy : FORWARD
>
>
>
> 1.18 Execution Plan
> == Abstract Syntax Tree ==
> LogicalProject(level_type=[COALESCE($1, GET_JSON_OBJECT($0,
> _UTF-16LE'$.level_type'), _UTF-16LE'0')])
> +- LogicalTableScan(table=[[default_catalog, default_database,
> shadow_source]])
> == Optimized Physical Plan ==
> Calc(select=[COALESCE(level_type, GET_JSON_OBJECT(order_n_tuple,
> _UTF-16LE'$.level_type'), _UTF-16LE'0') AS level_type])
> +- TableSourceScan(table=[[default_catalog, default_database,
> shadow_source]], fields=[order_n_tuple, level_type])
> == Optimized Execution Plan ==
> Calc(select=[COALESCE(level_type, GET_JSON_OBJECT(order_n_tuple,
> '$.level_type'), '0') AS level_type])
> +- TableSourceScan(table=[[default_catalog, default_database,
> shadow_source]], fields=[order_n_tuple, level_type])
>
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)