[ 
https://issues.apache.org/jira/browse/FLINK-39531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu reassigned FLINK-39531:
-------------------------------

    Assignee: Liu Liu

> ScalarFunctionSplitter extracts the same Python/Async UDF call multiple times
> -----------------------------------------------------------------------------
>
>                 Key: FLINK-39531
>                 URL: https://issues.apache.org/jira/browse/FLINK-39531
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / Planner
>    Affects Versions: 2.2.0
>            Reporter: Liu Liu
>            Assignee: Liu Liu
>            Priority: Major
>
> Currently, ScalarFunctionSplitter keeps a record of extracted RexNodes so 
> that identical remote function calls are deduplicated during calc splitting. 
> However, the record is keyed on the RexInputRef rather than on the original 
> RexNode, so the lookup never matches. As a result, when the same Python or 
> Async UDF call appears in multiple projections of one SELECT, each occurrence 
> is extracted separately and the UDF is invoked once per occurrence.
>  
> *Example*
> {{SELECT pyFunc1(a, c) + 1, pyFunc1(a, c) + 2 FROM MyTable}}
> Current plan:
> {{FlinkLogicalCalc(select=[+(f0, 1) AS EXPR$0, +(f1, 2) AS EXPR$1])}}
> {{+- FlinkLogicalCalc(select=[pyFunc1(a, c) AS f0, pyFunc1(a, c) AS f1])}}
> {{   +- TableSourceScan(...)}}
> Expected plan:
> {{FlinkLogicalCalc(select=[+(f0, 1) AS EXPR$0, +(f0, 2) AS EXPR$1])}}
> {{+- FlinkLogicalCalc(select=[pyFunc1(a, c) AS f0])}}
> {{   +- TableSourceScan(...)}}
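
The difference between the two plans can be illustrated with a small, self-contained model. This is only a sketch of the deduplication idea, not Flink's planner code: the names Call, InputRef and split, and the key_on_original flag, are hypothetical stand-ins for RexCall, RexInputRef and the splitter's bookkeeping. It assumes the bug behaves like a lookup keyed on a freshly created reference, which can never match a previous entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """Simplified stand-in for a RexCall (hypothetical, not a Flink class)."""
    name: str
    operands: tuple
    remote: bool = False  # True models a Python/Async UDF call


@dataclass(frozen=True)
class InputRef:
    """Simplified stand-in for a RexInputRef: field f<index> of the lower Calc."""
    index: int


def split(projections, key_on_original):
    """Split projections into (upper projections, expressions of the lower Calc)."""
    extracted = []  # remote calls computed by the lower Calc
    seen = {}       # bookkeeping used for deduplication

    def visit(node):
        if not isinstance(node, Call):
            return node  # column name or literal, left unchanged
        if node.remote:
            ref = InputRef(len(extracted))
            # Keying on the fresh reference never matches an earlier entry;
            # keying on the original call does.
            key = node if key_on_original else ref
            if key in seen:
                return seen[key]
            extracted.append(node)
            seen[key] = ref
            return ref
        return Call(node.name, tuple(visit(o) for o in node.operands))

    upper = [visit(p) for p in projections]
    return upper, extracted


# SELECT pyFunc1(a, c) + 1, pyFunc1(a, c) + 2 FROM MyTable
py_call = Call("pyFunc1", ("a", "c"), remote=True)
query = [Call("+", (py_call, 1)), Call("+", (py_call, 2))]

for flag in (False, True):
    upper, lower = split(query, key_on_original=flag)
    print("key_on_original =", flag, "-> lower Calc fields:", len(lower))
```

In this model, keying on the original call makes both upper projections refer to the same extracted field, which corresponds to the expected plan above.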



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
