[ https://issues.apache.org/jira/browse/FLINK-39531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dian Fu reassigned FLINK-39531:
-------------------------------
Assignee: Liu Liu
> ScalarFunctionSplitter extracts the same Python/Async UDF call multiple times
> -----------------------------------------------------------------------------
>
> Key: FLINK-39531
> URL: https://issues.apache.org/jira/browse/FLINK-39531
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Planner
> Affects Versions: 2.2.0
> Reporter: Liu Liu
> Assignee: Liu Liu
> Priority: Major
>
> Currently, ScalarFunctionSplitter keeps track of extracted RexNodes in order
> to deduplicate identical remote function calls during calc splitting, but
> the bookkeeping is keyed on the RexInputRef rather than the original RexNode.
> Therefore, when the same Python or Async UDF call appears within multiple
> projections of one SELECT, each occurrence is extracted separately and the
> UDF is invoked once per occurrence for every row.
>
> *Example*
> {{SELECT pyFunc1(a, c) + 1, pyFunc1(a, c) + 2 FROM MyTable}}
> Current plan:
> {{FlinkLogicalCalc(select=[+(f0, 1) AS EXPR$0, +(f1, 2) AS EXPR$1])}}
> {{+- FlinkLogicalCalc(select=[pyFunc1(a, c) AS f0, pyFunc1(a, c) AS f1])}}
> {{   +- TableSourceScan(...)}}
> Expected plan:
> {{FlinkLogicalCalc(select=[+(f0, 1) AS EXPR$0, +(f0, 2) AS EXPR$1])}}
> {{+- FlinkLogicalCalc(select=[pyFunc1(a, c) AS f0])}}
> {{   +- TableSourceScan(...)}}
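
The keying problem described above can be illustrated with a small toy model. This is only a sketch and not the actual planner code: the tuple-based Call type, the split function, and the key_on_original flag are all hypothetical stand-ins for RexCall, the splitter, and the two keying strategies.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a RexCall: (function name, operand names).
Call = Tuple[str, Tuple[str, ...]]


def split(calls: List[Call], key_on_original: bool) -> Tuple[List[Call], List[str]]:
    """Toy calc split.

    Returns the calls extracted into the lower calc and, for each input
    call, the field reference the upper calc would use in its place.
    """
    extracted: List[Call] = []
    seen: Dict[object, str] = {}
    refs: List[str] = []
    for call in calls:
        # The lookup is always done with the original call.
        ref = seen.get(call)
        if ref is None:
            ref = f"f{len(extracted)}"
            extracted.append(call)
            # Keying on the reference (assumed model of the bug) means the
            # lookup above can never hit for a repeated call.
            seen[call if key_on_original else ref] = ref
        refs.append(ref)
    return extracted, refs


calls = [("pyFunc1", ("a", "c")), ("pyFunc1", ("a", "c"))]
buggy_extracted, buggy_refs = split(calls, key_on_original=False)
fixed_extracted, fixed_refs = split(calls, key_on_original=True)
```

Under these assumptions, the reference-keyed variant extracts pyFunc1(a, c) twice (f0 and f1), matching the current plan, while the variant keyed on the original call extracts it once and reuses f0, matching the expected plan.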
--
This message was sent by Atlassian Jira
(v8.20.10#820010)