Liu Liu created FLINK-39531:
-------------------------------
Summary: ScalarFunctionSplitter extracts the same Python/Async UDF
call multiple times
Key: FLINK-39531
URL: https://issues.apache.org/jira/browse/FLINK-39531
Project: Flink
Issue Type: Bug
Components: Table SQL / Planner
Reporter: Liu Liu
Currently, ScalarFunctionSplitter keeps track of extracted RexNodes so that it
can deduplicate identical remote function calls during calc splitting. However,
the bookkeeping is keyed on the replacement RexInputRef rather than on the
original RexNode, so later occurrences of the same call do not match the
existing entry. As a result, when the same Python or Async UDF call appears
inside multiple projections of one SELECT, each occurrence is extracted
separately and the UDF is invoked once per occurrence.
*Example*
{{SELECT pyFunc1(a, c) + 1, pyFunc1(a, c) + 2 FROM MyTable}}
Current plan:
{{FlinkLogicalCalc(select=[+(f0, 1) AS EXPR$0, +(f1, 2) AS EXPR$1])}}
{{+- FlinkLogicalCalc(select=[pyFunc1(a, c) AS f0, pyFunc1(a, c) AS f1])}}
{{   +- TableSourceScan(...)}}
Expected plan:
{{FlinkLogicalCalc(select=[+(f0, 1) AS EXPR$0, +(f0, 2) AS EXPR$1])}}
{{+- FlinkLogicalCalc(select=[pyFunc1(a, c) AS f0])}}
{{   +- TableSourceScan(...)}}
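
*Illustration (toy model)*

The difference between the two plans can be reproduced with a small, self-contained sketch. This is not Flink's code: {{ToySplitter}}, {{FieldRef}}, {{Call}}, {{REMOTE_FUNCTIONS}} and the {{key_on_original}} flag are hypothetical names made up for this illustration, and the sketch only models the bookkeeping described above. It assumes the intended behavior is to key the bookkeeping on the original call expression.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRef:
    """Stand-in for a reference to an input field (hypothetical)."""
    name: str


@dataclass(frozen=True)
class Call:
    """Stand-in for a function call expression (hypothetical)."""
    name: str
    operands: tuple


# Assumption for this sketch: pyFunc1 is the only remote (Python/Async) function.
REMOTE_FUNCTIONS = {"pyFunc1"}


class ToySplitter:
    def __init__(self, key_on_original):
        self.key_on_original = key_on_original
        self.extracted = []  # calls pushed into the bottom calc, in order
        self.seen = {}       # bookkeeping used for deduplication

    def visit(self, expr):
        if not isinstance(expr, Call):
            return expr  # field references and literals are left alone
        if expr.name in REMOTE_FUNCTIONS:
            return self._extract(expr)
        return Call(expr.name, tuple(self.visit(op) for op in expr.operands))

    def _extract(self, call):
        # The lookup always uses the original call expression.
        if call in self.seen:
            return self.seen[call]
        ref = FieldRef(f"f{len(self.extracted)}")
        self.extracted.append(call)
        # Keying on `ref` models the reported behavior: the lookup above
        # can then never hit, because a Call is never equal to a FieldRef.
        self.seen[call if self.key_on_original else ref] = ref
        return ref


def split(projections, key_on_original):
    splitter = ToySplitter(key_on_original)
    top = [splitter.visit(p) for p in projections]
    return top, splitter.extracted


# SELECT pyFunc1(a, c) + 1, pyFunc1(a, c) + 2 FROM MyTable
py_call = Call("pyFunc1", (FieldRef("a"), FieldRef("c")))
projections = [Call("+", (py_call, 1)), Call("+", (py_call, 2))]

buggy_top, buggy_bottom = split(projections, key_on_original=False)
fixed_top, fixed_bottom = split(projections, key_on_original=True)
```

With {{key_on_original=False}} the bottom list holds two copies of the call and the top projections refer to {{f0}} and {{f1}}, mirroring the current plan. With {{key_on_original=True}} the call is extracted once and both top projections refer to {{f0}}, mirroring the expected plan.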