yuxiqian opened a new pull request, #4395:
URL: https://github.com/apache/flink-cdc/pull/4395
This PR provides the ability to write inline Python UDFs with YAML pipeline
jobs. It could be used like this:
```yaml
transform:
- source-table: db.users
projection: ID, py_normalize(EMAIL) AS EMAIL_NORM, py_double(AGE) AS
DOUBLED
pipeline:
user-defined-function:
- name: py_normalize
classpath: org.apache.flink.cdc.python.PythonUdf
options:
python-executable: /usr/bin/python3
source: |
def eval(s: str) -> str:
return s.strip().lower()
- name: py_accumulate
classpath: org.apache.flink.cdc.python.PythonUdf
options:
python-executable: /usr/bin/python3
source: |
total = 0
def eval(x: int) -> int:
global total
total += x
return total
```
The wrapper itself is implemented as a Java UDF as well. No changes are made
in the existing framework except the following:
* Runtime UDF binding names are slightly changed to allow defining multiple
UDFs with the same class.
* Added an overload function for `UserDefinedFunction#getReturnType` to pass
extra context info.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]