[ 
https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17807486#comment-17807486
 ] 

Zhenqiu Huang commented on FLINK-31275:
---------------------------------------

Hi everyone, I would also like to bump this thread because we have an internal 
need for it. I think the OpenLineage community made a very good suggestion: 
define an intermediate representation (Dataset) for the metadata of a 
source/sink. A LineageVertex could also have multiple datasets, for example 
Hybrid source users who read from Kafka first and then switch to Iceberg. Given 
this, I feel the config should live in the dataset rather than in 
LineageVertex. We also want to make column lineage possible, so having the 
query in the dataset would make it reasonable for a lineage provider to analyze 
the column relationships. For the input/output schema, we may put it into a 
facet, which could be optional depending on the connector implementation. What 
do you think, [~zjureel] [~mobuchowski]?


public interface LineageVertex {
    /* List of input (for a source) or output (for a sink) datasets that the 
       connector interacts with. */
    List<Dataset> datasets();
}

public interface Dataset {
    /* Name of this particular dataset. */
    String name();
    /* Unique name of this dataset's datasource. */
    String namespace();
    /* Query used to generate the dataset, if there is one. */
    String query();
    /* Facets that describe particular information about the dataset. */
    Map<FacetType, Facet> facets();
}

Facet types could include SchemaFacet and ConfigFacet.
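
To make the Hybrid source case concrete, here is a rough sketch of how one vertex could carry two datasets under this proposal. It is only an illustration, not the FLIP-314 API: the record types (SimpleDataset, SimpleVertex), the FacetType enum values, the shape of SchemaFacet and ConfigFacet, and all names, namespaces, and option values are hypothetical, and the interfaces are written with accessor methods so that the snippet compiles (Java 16+ for records).

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the proposed model; none of these types are the real Flink API.
final class LineageSketch {
    enum FacetType { SCHEMA, CONFIG }

    interface Facet {}

    // Assumed facet shapes: a list of column names and a map of connector options.
    record SchemaFacet(List<String> columns) implements Facet {}
    record ConfigFacet(Map<String, String> options) implements Facet {}

    interface Dataset {
        String name();
        String namespace();
        String query(); // null when the dataset is not produced by a query
        Map<FacetType, Facet> facets();
    }

    interface LineageVertex {
        List<Dataset> datasets();
    }

    // Minimal immutable implementations, used only for this illustration.
    record SimpleDataset(String name, String namespace, String query,
                         Map<FacetType, Facet> facets) implements Dataset {}

    record SimpleVertex(List<Dataset> datasets) implements LineageVertex {}

    // A Hybrid source that reads from Kafka first and then switches to Iceberg:
    // one vertex, two datasets, each with its own config or schema facet.
    static LineageVertex hybridSourceVertex() {
        Dataset kafka = new SimpleDataset(
                "orders", "kafka://broker:9092", null,
                Map.of(FacetType.CONFIG,
                        new ConfigFacet(Map.of("scan.startup.mode", "earliest-offset"))));
        Dataset iceberg = new SimpleDataset(
                "db.orders", "iceberg://warehouse", "SELECT id, amount FROM db.orders",
                Map.of(FacetType.SCHEMA, new SchemaFacet(List.of("id", "amount"))));
        return new SimpleVertex(List.of(kafka, iceberg));
    }

    public static void main(String[] args) {
        for (Dataset d : hybridSourceVertex().datasets()) {
            System.out.println(d.namespace() + " / " + d.name() + " facets=" + d.facets().keySet());
        }
    }
}
```

With the config kept per dataset, a lineage listener can tell the Kafka settings apart from the Iceberg ones even though both belong to the same vertex.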



> Flink supports reporting and storage of source/sink tables relationship
> -----------------------------------------------------------------------
>
>                 Key: FLINK-31275
>                 URL: https://issues.apache.org/jira/browse/FLINK-31275
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Planner
>    Affects Versions: 1.18.0
>            Reporter: Fang Yong
>            Assignee: Fang Yong
>            Priority: Major
>
> FLIP-314 has been accepted 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)