Liu Liu created FLINK-39531:
-------------------------------

             Summary: ScalarFunctionSplitter extracts the same Python/Async UDF 
call multiple times
                 Key: FLINK-39531
                 URL: https://issues.apache.org/jira/browse/FLINK-39531
             Project: Flink
          Issue Type: Bug
          Components: Table SQL / Planner
            Reporter: Liu Liu


Currently, ScalarFunctionSplitter keeps track of extracted RexNodes so that 
identical remote function calls are deduplicated during calc splitting, but 
this bookkeeping is keyed on the generated RexInputRef rather than on the 
original RexNode. As a result, when the same Python or Async UDF call appears 
in multiple projections of one SELECT, each occurrence is extracted separately 
and the UDF is invoked once per occurrence instead of once overall.

 

*Example*

{{SELECT pyFunc1(a, c) + 1, pyFunc1(a, c) + 2 FROM MyTable}}

Current plan:

{{FlinkLogicalCalc(select=[+(f0, 1) AS EXPR$0, +(f1, 2) AS EXPR$1])}}
{{+- FlinkLogicalCalc(select=[pyFunc1(a, c) AS f0, pyFunc1(a, c) AS f1])}}
{{   +- TableSourceScan(...)}}

Expected plan:

{{FlinkLogicalCalc(select=[+(f0, 1) AS EXPR$0, +(f0, 2) AS EXPR$1])}}
{{+- FlinkLogicalCalc(select=[pyFunc1(a, c) AS f0])}}
{{   +- TableSourceScan(...)}}
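
The difference between the two plans can be reproduced with a small toy model. This is only a sketch of the keying problem, not Flink's actual implementation: the expression classes ({{Call}}, {{Col}}, {{Lit}}, {{Ref}}), the {{split}} function and its {{key_on_original}} flag are hypothetical names introduced for illustration, and the real splitter works on Calcite RexNodes.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for Calcite expression nodes (not Flink classes).
@dataclass(frozen=True)
class Col:
    name: str

@dataclass(frozen=True)
class Lit:
    value: int

@dataclass(frozen=True)
class Ref:
    index: int  # plays the role of a RexInputRef into the bottom calc

@dataclass(frozen=True)
class Call:
    name: str
    args: tuple

REMOTE_FUNCTIONS = {"pyFunc1"}  # assumed set of Python/Async UDF names

def split(projections, key_on_original):
    """Split projections into (top calc, bottom calc of extracted calls)."""
    extracted = []  # projections of the bottom calc
    seen = {}       # bookkeeping used for deduplication

    def visit(expr):
        if isinstance(expr, Call) and expr.name in REMOTE_FUNCTIONS:
            if expr in seen:  # lookup always uses the original call
                return seen[expr]
            ref = Ref(len(extracted))
            extracted.append(expr)
            # Keying on the new reference models the reported behaviour:
            # a later lookup with the original call can never match it.
            seen[expr if key_on_original else ref] = ref
            return ref
        if isinstance(expr, Call):
            return Call(expr.name, tuple(visit(a) for a in expr.args))
        return expr

    top = [visit(p) for p in projections]
    return top, extracted

# SELECT pyFunc1(a, c) + 1, pyFunc1(a, c) + 2 FROM MyTable
udf = Call("pyFunc1", (Col("a"), Col("c")))
query = [Call("+", (udf, Lit(1))), Call("+", (udf, Lit(2)))]

buggy_top, buggy_bottom = split(query, key_on_original=False)
fixed_top, fixed_bottom = split(query, key_on_original=True)
```

With {{key_on_original=False}} the bottom calc holds two copies of the call and the top calc references f0 and f1, matching the current plan. With {{key_on_original=True}} the call is extracted once and both top-level projections reference f0, matching the expected plan.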



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
