Dian Fu created FLINK-15973: ------------------------------- Summary: Optimize the execution plan where it refers the Python UDF result field in the where clause Key: FLINK-15973 URL: https://issues.apache.org/jira/browse/FLINK-15973 Project: Flink Issue Type: Improvement Components: API / Python Reporter: Dian Fu Fix For: 1.11.0
For the following job: {code} t_env.register_function("inc", inc) table.select("inc(id) as inc_id") \ .where("inc_id > 0") \ .insert_into("sink") {code} The execution plan is as following: {code} StreamExecPythonCalc(select=inc(f0) AS inc_id)) +- StreamExecCalc(select=id AS f0, where=>(f0, 0)) +--- StreamExecPythonCalc(select=id, inc(f0) AS f0)) +-----StreamExecCalc(select=id, id AS f0)) +-------StreamExecTableSourceScan(fields=id) {code} The plan is not the best. It should be as following: {code} StreamExecPythonCalc(select=f0) +- StreamExecCalc(select=f0, where=>(f0, 0)) +--- StreamExecPythonCalc(select=inc(f0) AS f0)) +-----StreamExecCalc(select=id, id AS f0)) +-------StreamExecTableSourceScan(fields=id) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)