yuxiqian opened a new pull request, #4395:
URL: https://github.com/apache/flink-cdc/pull/4395

   This PR provides the ability to write inline Python UDFs with YAML pipeline 
jobs. It could be used like this:
   
   ```yaml
   transform:
     - source-table: db.users
       projection: ID, py_normalize(EMAIL) AS EMAIL_NORM, py_double(AGE) AS 
DOUBLED
   
   pipeline:
     user-defined-function:
       - name: py_normalize
         classpath: org.apache.flink.cdc.python.PythonUdf
         options:
           python-executable: /usr/bin/python3
           source: |
             def eval(s: str) -> str:
               return s.strip().lower()
       - name: py_accumulate
         classpath: org.apache.flink.cdc.python.PythonUdf
         options:
           python-executable: /usr/bin/python3
           source: |
             total = 0
             def eval(x: int) -> int:
               global total
               total += x
               return total
   ```
   
   The wrapper itself is implemented as a Java UDF as well. No changes are made 
in the existing framework except the following:
   
   * Runtime UDF binding names are slightly changed to allow defining multiple 
UDFs with the same class.
   * Added an overload function for `UserDefinedFunction#getReturnType` to pass 
extra context info.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to