Huang Xingbo created FLINK-25551:
------------------------------------

             Summary: Add example and documentation on the usage of Row in 
Python UDTF
                 Key: FLINK-25551
                 URL: https://issues.apache.org/jira/browse/FLINK-25551
             Project: Flink
          Issue Type: Improvement
          Components: API / Python, Documentation
    Affects Versions: 1.14.2, 1.13.5, 1.15.0
            Reporter: Huang Xingbo


The following example comes from pyflink users:

{code:python}
        source_table = """
            CREATE TABLE source_table (
              a ARRAY<ROW<PRODID STRING, ADDMONEY DOUBLE>>,
              b ARRAY<DOUBLE>
            ) WITH (
                'connector' = 'datagen',
                'number-of-rows' = '10'
            )
        """
        @udtf(result_types=[DataTypes.STRING(), DataTypes.DOUBLE()])
        def split(x: list):
            for s in x:
                yield s

         @udtf(result_types=[
                                 DataTypes.ROW([
                                          DataTypes.FIELD("PRODID", 
DataTypes.STRING()),
                                          DataTypes.FIELD("ADDMONEY", 
DataTypes.DOUBLE())])])
        def split2(x: list):
            for s in x:
                yield s,  # NOTE: This ',' is important

        t_env.execute_sql(source_table)
        # If you want to split the Row into two columns
        t_env.from_path("source_table").join_lateral(split(col('a')).alias('x', 
'y')).select("x, y")
        # If you want to treat the entire row as a column
        
t_env.from_path("source_table").join_lateral(split2(col('a')).alias('x')).select("x")

{code}




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to