dianfu commented on a change in pull request #11448:
[FLINK-16666][python][table] Support new Python dependency configuration
options in flink-java, flink-streaming-java and flink-table.
URL: https://github.com/apache/flink/pull/11448#discussion_r396238669
##########
File path:
flink-core/src/main/java/org/apache/flink/configuration/PythonOptions.java
##########
@@ -81,4 +79,60 @@
 			"buffer of a Python worker. The memory will be accounted as managed memory if the " +
 			"actual memory allocated to an operator is no less than the total memory of a Python " +
 			"worker. Otherwise, this configuration takes no effect.");
+
+	public static final ConfigOption<String> PYTHON_FILES = ConfigOptions
+		.key("python.files")
+		.stringType()
+		.noDefaultValue()
+		.withDescription("Attach custom python files for the job. These files will " +
+			"be added to the PYTHONPATH of both the local client and the remote python UDF " +
+			"worker. The standard python resource file suffixes such as .py/.egg/.zip as well as " +
+			"directories are all supported. Comma (',') can be used as the separator to specify " +
+			"multiple files. This option is equivalent to the command line option \"-pyfs\".");
+
+	public static final ConfigOption<String> PYTHON_REQUIREMENTS = ConfigOptions
+		.key("python.requirements")
+		.stringType()
+		.noDefaultValue()
+		.withDescription("Specify a requirements.txt file which defines the third-party " +
+			"dependencies. These dependencies will be installed and added to the PYTHONPATH of " +
+			"the python UDF worker. A directory which contains the installation packages of " +
+			"these dependencies can optionally be specified. Use '#' as the separator if the " +
+			"optional parameter exists. This option is equivalent to the command line option " +
+			"\"-pyreq\".");
+
+	public static final ConfigOption<String> PYTHON_ARCHIVES = ConfigOptions
+		.key("python.archives")
+		.stringType()
+		.noDefaultValue()
+		.withDescription("Add python archive files for the job. The archive files will be " +
+			"extracted to the working directory of the python UDF worker. Currently only the zip " +
+			"format is supported. For each archive file, a target directory can optionally be " +
+			"specified. If the target directory name is specified, the archive file will be " +
+			"extracted to a directory with the specified name. Otherwise, the archive file will " +
+			"be extracted to a directory with the same name as the archive file. The files " +
+			"uploaded via this option are accessible via relative path. '#' can be used as the " +
+			"separator between the archive file path and the target directory name. Comma (',') " +
+			"can be used as the separator to specify multiple archive files. This option can be " +
+			"used to upload the virtual environment and the data files used in Python UDFs. The " +
+			"data files can be accessed in Python UDFs, e.g.: f = open('data/data.txt', 'r'). " +
+			"This option is equivalent to the command line option \"-pyarch\".");
+
+	public static final ConfigOption<String> PYTHON_EXECUTABLE = ConfigOptions
+		.key("python.executable")
+		.stringType()
+		.noDefaultValue()
+		.withDescription("Specify the path of the python interpreter used to execute the python " +
+			"UDF worker. The python UDF worker depends on Python 3.5+, Apache Beam " +
+			"(version == 2.19.0), Pip (version >= 7.1.0) and SetupTools (version >= 37.0.0). " +
+			"Please ensure that the specified environment meets the above requirements. This " +
+			"option is equivalent to the command line option \"-pyexec\".");
+
+	public static final ConfigOption<String> PYTHON_CLIENT_EXECUTABLE = ConfigOptions
+		.key("python.client.executable")
+		.stringType()
+		.defaultValue("python")
+		.withDescription("The python interpreter used to launch the python process when compiling " +
+			"the jobs containing Python UDFs. Equivalent to the environment variable PYFLINK_EXECUTABLE. " +
+			"The precedence is: 1. configuration in job source code. 2. environment variable. " +
 Review comment:
   Do you mean the priority? If so, I suggest changing `precedence` to
 `priority` to make it clearer:
   `The priority is as follows: 1. the configuration
 'python.client.executable' defined in the source code; 2. the environment
 variable PYFLINK_EXECUTABLE; 3. the configuration 'python.client.executable'
 defined in flink-conf.yaml`
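   The lookup order under discussion can be sketched in plain Java. This is a hypothetical helper (the class and method names are illustrative, not Flink's actual resolution code), assuming each source yields `null` when unset and that the declared default of `python.client.executable` is `"python"`:

   ```java
   // Sketch of the suggested priority for python.client.executable:
   // 1. value set in the job source code,
   // 2. the PYFLINK_EXECUTABLE environment variable,
   // 3. value from flink-conf.yaml,
   // falling back to the option's declared default.
   public class ExecutableResolution {

       static final String DEFAULT_EXECUTABLE = "python";

       static String resolve(String fromJobCode, String fromEnvVar, String fromFlinkConf) {
           if (fromJobCode != null) {
               return fromJobCode;
           }
           if (fromEnvVar != null) {
               return fromEnvVar;
           }
           if (fromFlinkConf != null) {
               return fromFlinkConf;
           }
           return DEFAULT_EXECUTABLE;
       }

       public static void main(String[] args) {
           // Nothing set in job code: the environment variable wins.
           System.out.println(resolve(null, "/usr/bin/python3", "python"));
       }
   }
   ```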
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services