[ 
https://issues.apache.org/jira/browse/FLINK-16666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17071640#comment-17071640
 ] 

Wei Zhong commented on FLINK-16666:
-----------------------------------

Hi [~aljoscha],

Yes, we don't really support executing Python UDFs in `flink-core`, `flink-java` 
and `flink-streaming-java`. The code we added there is only used to define and 
process the Python configurations.

First, many modules in our code base support Python, e.g. `SQL DDL`, 
`flink-table-planner`, `flink-table-planner-blink`, `flink-client`, 
`flink-sql-client`, etc. For simplicity, let's call the modules that support 
Python "python-related modules". The number of python-related modules will 
increase in the future, e.g. `flink-streaming-java` (for the PyFlink DataStream 
API) and `flink-container` (for k8s support).

All python-related modules need Python dependency management. To unify the 
interface of Python dependency management and to decouple the python-related 
modules from it, we intend to store the Python dependency information as 
configurations. These configurations will be stored in the `Configuration` 
object of the `ExecutionEnvironment`/`StreamExecutionEnvironment`. Once 
execution reaches the code of the `flink-python` module, these configurations 
will be used to build the Python environment.
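
As a rough illustration of this flow, here is a minimal sketch in plain Python. 
It is not Flink code: a dict stands in for the `Configuration` object, the key 
names are assumptions chosen for illustration, and `add_python_file` and 
`build_python_env` are hypothetical helpers.

```python
# Hypothetical model: a plain dict stands in for Flink's `Configuration`.
PYTHON_FILES = "python.files"                # assumed key name
PYTHON_REQUIREMENTS = "python.requirements"  # assumed key name


def add_python_file(config, path):
    """A python-related module records a dependency by writing to the config."""
    existing = config.get(PYTHON_FILES, "")
    config[PYTHON_FILES] = f"{existing},{path}" if existing else path


def build_python_env(config):
    """Stand-in for flink-python: derive the environment from the config only."""
    files = [f for f in config.get(PYTHON_FILES, "").split(",") if f]
    return {
        "files": files,
        "requirements": config.get(PYTHON_REQUIREMENTS),
    }


config = {}
add_python_file(config, "udfs.py")
add_python_file(config, "utils.zip")
config[PYTHON_REQUIREMENTS] = "requirements.txt"

env = build_python_env(config)
print(env)
```

In this sketch the modules that record dependencies never call into the code 
that builds the environment. The configuration is the only thing they share.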

Because every python-related module needs to read the definitions of the Python 
ConfigOptions, we put those definitions (i.e. the `PythonOptions` class) in 
`flink-core`, just like other config options. The python-related modules also 
need to process these configurations (i.e. register files to the distributed 
cache). For code reuse, we process them in the `configure()` method of 
`ExecutionEnvironment`/`StreamExecutionEnvironment`. We could also do this by 
repeating the logic in each python-related module, or by putting the logic in 
`flink-python` and calling it via reflection when needed, but neither option 
seems very clean.
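
To make the code-reuse argument concrete, here is a small sketch in plain 
Python, again not Flink code. `Environment` and `DistributedCache` are 
hypothetical stand-ins for the environment classes and the distributed cache, 
and the `python.files` key and the `python_file_N` cache names are assumptions.

```python
class DistributedCache:
    """Hypothetical stand-in for the distributed cache: name -> file path."""

    def __init__(self):
        self.entries = {}

    def register(self, name, path):
        self.entries[name] = path


class Environment:
    """Hypothetical stand-in for ExecutionEnvironment/StreamExecutionEnvironment."""

    def __init__(self):
        self.config = {}
        self.cache = DistributedCache()

    def configure(self, new_config):
        # Merge the options, then process the Python ones in one shared place.
        self.config.update(new_config)
        files = self.config.get("python.files", "")
        for index, path in enumerate(p for p in files.split(",") if p):
            self.cache.register(f"python_file_{index}", path)


# Two different callers only pass configuration; neither touches the cache.
table_env = Environment()
table_env.configure({"python.files": "udfs.py,utils.zip"})

client_env = Environment()
client_env.configure({"parallelism.default": "4"})

print(table_env.cache.entries)
print(client_env.cache.entries)
```

Because the registration logic lives in the shared `configure()` step, a caller 
that sets no Python options gets an empty cache and no module has to duplicate 
the processing code.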

> Support new Python dependency configuration options in flink-java, 
> flink-streaming-java and flink-table
> -------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-16666
>                 URL: https://issues.apache.org/jira/browse/FLINK-16666
>             Project: Flink
>          Issue Type: Sub-task
>          Components: API / Python
>            Reporter: Wei Zhong
>            Assignee: Wei Zhong
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.11.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>




