Hi Martijn,

If there are no more comments, I will start a vote for this, thanks.

Best,
Fang Yong

On Tue, Feb 20, 2024 at 4:53 PM Yong Fang <zjur...@gmail.com> wrote:

> Hi Martijn,
>
> Thank you for your attention. Let me first explain the specific situation
> of FLIP-314. FLIP-314 is currently in an accepted state, but actual code
> development has not yet begun, and the interface-related PR has not been
> merged into master. So it may not be necessary for us to create a separate
> FLIP. Currently, my idea is to directly update the interfaces in FLIP-314,
> but to initiate a separate thread with the context, and we can vote there.
>
> What do you think? Thanks
>
> Best,
> Fang Yong
>
> On Mon, Feb 19, 2024 at 8:27 PM Martijn Visser <martijnvis...@apache.org>
> wrote:
>
>> I'm a bit confused: did we add new interfaces after FLIP-314 was
>> accepted? If so, please move the new interfaces to a new FLIP and
>> start a separate vote. We can't retrospectively change an accepted
>> FLIP with new interfaces and a new vote.
>>
>> On Mon, Feb 19, 2024 at 3:22 AM Yong Fang <zjur...@gmail.com> wrote:
>> >
>> > Hi all,
>> >
>> > If there is no more feedback, I will start a vote for the new interfaces
>> > in the next day, thanks
>> >
>> > Best,
>> > Fang Yong
>> >
>> > On Thu, Feb 8, 2024 at 1:30 PM Yong Fang <zjur...@gmail.com> wrote:
>> >
>> > > Hi devs,
>> > >
>> > > According to the online discussion in FLINK-3127 [1] and the offline
>> > > discussion with Maciej Obuchowski and Zhenqiu Huang, we would like
>> > > to update the lineage vertex relevant interfaces in FLIP-314 [2] as
>> > > follows:
>> > >
>> > > 1. Introduce `LineageDataset` which represents source and sink in
>> > > `LineageVertex`. The fields in `LineageDataset` are as follows:
>> > >     /* Name for this particular dataset. */
>> > >     String name;
>> > >     /* Unique name for this dataset's storage, for example, url for jdbc
>> > >     connector and location for lakehouse connector.
>> > >     */
>> > >     String namespace;
>> > >     /* Facets for the lineage vertex to describe the particular
>> > >     information of dataset, such as schema and config. */
>> > >     Map<String, Facet> facets;
>> > >
>> > > 2. There may be multiple datasets in one `LineageVertex`, for example,
>> > > kafka source or hybrid source. So users can get the dataset list from
>> > > `LineageVertex`:
>> > >     /** Get datasets from the lineage vertex. */
>> > >     List<LineageDataset> datasets();
>> > >
>> > > 3. There will be built-in facets for config and schema. To describe
>> > > columns in table/sql jobs and datastream jobs, we introduce
>> > > `DatasetSchemaField`.
>> > >     /** Builtin config facet for dataset. */
>> > >     @PublicEvolving
>> > >     public interface DatasetConfigFacet extends LineageDatasetFacet {
>> > >         Map<String, String> config();
>> > >     }
>> > >
>> > >     /** Field for schema in dataset. */
>> > >     public interface DatasetSchemaField<T> {
>> > >         /** The name of the field. */
>> > >         String name();
>> > >         /** The type of the field. */
>> > >         T type();
>> > >     }
>> > >
>> > > Thanks for the valuable input from @Maciej and @Zhenqiu. Looking forward
>> > > to your feedback, thanks
>> > >
>> > > Best,
>> > > Fang Yong
>> > >
>> > > On Mon, Sep 25, 2023 at 1:18 PM Shammon FY <zjur...@gmail.com> wrote:
>> > >
>> > >> Hi David,
>> > >>
>> > >> Do you want the detailed topology for a Flink job? You can get
>> > >> `JobDetailsInfo` from `RestClusterClient` with the submitted job id; it
>> > >> has `String jsonPlan`. You can parse the json plan to get all steps and
>> > >> the relations between them in a Flink job. Hope this can help you,
>> > >> thanks!
>> > >>
>> > >> Best,
>> > >> Shammon FY
>> > >>
>> > >> On Tue, Sep 19, 2023 at 11:46 PM David Radley <david_rad...@uk.ibm.com>
>> > >> wrote:
>> > >>
>> > >>> Hi there,
>> > >>> I am looking at the interfaces.
>> > >>> If I am reading it correctly, there is
>> > >>> one relationship between the source and sink, and this relationship
>> > >>> represents the operational lineage. Lineage is usually represented as
>> > >>> asset -> process -> asset; see for example
>> > >>> https://egeria-project.org/features/lineage-management/overview/#the-lineage-graph
>> > >>>
>> > >>> Maybe I am missing it, but it seems to me that it would be useful to
>> > >>> store the process in the lineage graph.
>> > >>>
>> > >>> It is useful to have the top-level lineage as source -> Flink job ->
>> > >>> sink, where the Flink job is the process, but also to have this asset ->
>> > >>> process -> asset pattern for each of the steps in the job. If this is
>> > >>> present, please could you point me to it.
>> > >>>
>> > >>> Kind regards, David.
>> > >>>
>> > >>> From: David Radley <david_rad...@uk.ibm.com>
>> > >>> Date: Tuesday, 19 September 2023 at 16:11
>> > >>> To: dev@flink.apache.org <dev@flink.apache.org>
>> > >>> Subject: [EXTERNAL] RE: [DISCUSS] FLIP-314: Support Customized Job
>> > >>> Lineage Listener
>> > >>> Hi,
>> > >>> I notice that there is an experimental lineage integration for Flink
>> > >>> with OpenLineage https://openlineage.io/docs/integrations/flink . I
>> > >>> think this feature would allow for a superior Flink OpenLineage
>> > >>> integration,
>> > >>> Kind regards, David.
>> > >>>
>> > >>> From: XTransfer <jiabao....@xtransfer.cn.INVALID>
>> > >>> Date: Tuesday, 19 September 2023 at 15:47
>> > >>> To: dev@flink.apache.org <dev@flink.apache.org>
>> > >>> Subject: [EXTERNAL] Re: [DISCUSS] FLIP-314: Support Customized Job
>> > >>> Lineage Listener
>> > >>> Thanks Shammon for this proposal.
>> > >>>
>> > >>> That’s helpful for collecting the lineage of Flink tasks.
>> > >>> Looking forward to its implementation.
>> > >>>
>> > >>> Best,
>> > >>> Jiabao
>> > >>>
>> > >>>
>> > >>> > On Sep 18, 2023, at 8:56 PM, Leonard Xu <xbjt...@gmail.com> wrote:
>> > >>> >
>> > >>> > Thanks Shammon for the information, the comment makes the lifecycle
>> > >>> > clearer.
>> > >>> > +1
>> > >>> >
>> > >>> >
>> > >>> > Best,
>> > >>> > Leonard
>> > >>> >
>> > >>> >
>> > >>> >> On Sep 18, 2023, at 7:54 PM, Shammon FY <zjur...@gmail.com> wrote:
>> > >>> >>
>> > >>> >> Hi devs,
>> > >>> >>
>> > >>> >> After discussing with @Qingsheng, I fixed a minor issue of the
>> > >>> >> lineage lifecycle in `StreamExecutionEnvironment`. I have added a
>> > >>> >> comment to explain that the lineage information in
>> > >>> >> `StreamExecutionEnvironment` will be consistent with that of
>> > >>> >> transformations. When users clear the existing transformations, the
>> > >>> >> added lineage information will also be deleted.
>> > >>> >>
>> > >>> >> Please help to review it again, and if there are no more concerns
>> > >>> >> about FLIP-314[1], I would like to start voting later, thanks.
>> > >>> >> cc @Leonard
>> > >>> >>
>> > >>> >> Best,
>> > >>> >> Shammon FY
>> > >>> >>
>> > >>> >> On Mon, Jul 17, 2023 at 3:43 PM Shammon FY <zjur...@gmail.com> wrote:
>> > >>> >> Hi devs,
>> > >>> >>
>> > >>> >> Thanks for all the valuable feedback. If there are no more concerns
>> > >>> >> about FLIP-314[1], I would like to start voting later, thanks.
>> > >>> >>
>> > >>> >>
>> > >>> >> [1]
>> > >>> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener
>> > >>> >>
>> > >>> >> Best,
>> > >>> >> Shammon FY
>> > >>> >>
>> > >>> >>
>> > >>> >> On Wed, Jul 12, 2023 at 11:18 AM Shammon FY <zjur...@gmail.com> wrote:
>> > >>> >> Thanks for the valuable feedback, Leonard.
>> > >>> >>
>> > >>> >> I have discussed with Leonard offline. We have reached some
>> > >>> >> conclusions about these issues and I have updated the FLIP as
>> > >>> >> follows:
>> > >>> >>
>> > >>> >> 1. Simplify the `LineageEdge` interface by creating an edge from one
>> > >>> >> source vertex to a sink vertex.
>> > >>> >> 2. Remove the `TableColumnSourceLineageVertex` interface and update
>> > >>> >> `TableColumnLineageEdge` to create an edge from columns in one source
>> > >>> >> to each sink column.
>> > >>> >> 3. Rename `SupportsLineageVertex` to `LineageVertexProvider`.
>> > >>> >> 4. Add method `addLineageEdges(LineageEdge ... edges)` in
>> > >>> >> `StreamExecutionEnvironment` for datastream jobs and remove the
>> > >>> >> previous methods in `DataStreamSource` and `DataStreamSink`.
>> > >>> >>
>> > >>> >> Looking forward to your feedback, thanks.
>> > >>> >>
>> > >>> >> Best,
>> > >>> >> Shammon FY
>> > >>> >
>> > >>>
>> > >>> Unless otherwise stated above:
>> > >>>
>> > >>> IBM United Kingdom Limited
>> > >>> Registered in England and Wales with number 741598
>> > >>> Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU
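
For readers following the thread, the dataset-related interfaces proposed in the Feb 8 message can be sketched in plain Java roughly as below. This is only a minimal illustration under stated assumptions and is not the FLIP-314 API: the accessor-style methods on `LineageDataset`, the use of `LineageDatasetFacet` as the facet value type, the helper methods `dataset` and `kafkaVertex`, the `kafka://` namespace format, and the facet key `config` are all hypothetical. It shows how a single vertex (for example a Kafka source reading several topics) could expose more than one dataset.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class LineageSketch {

    /** Marker for facets attached to a dataset (name taken from the proposal). */
    public interface LineageDatasetFacet {}

    /** Built-in config facet, as listed in item 3 of the proposal. */
    public interface DatasetConfigFacet extends LineageDatasetFacet {
        Map<String, String> config();
    }

    /** A source or sink dataset; the accessor style is an assumption. */
    public interface LineageDataset {
        String name();
        String namespace();
        Map<String, LineageDatasetFacet> facets();
    }

    /** A lineage vertex exposing one or more datasets (item 2 of the proposal). */
    public interface LineageVertex {
        List<LineageDataset> datasets();
    }

    /** Hypothetical helper that builds an immutable dataset description. */
    public static LineageDataset dataset(
            String name, String namespace, Map<String, LineageDatasetFacet> facets) {
        return new LineageDataset() {
            @Override public String name() { return name; }
            @Override public String namespace() { return namespace; }
            @Override public Map<String, LineageDatasetFacet> facets() { return facets; }
        };
    }

    /** Hypothetical helper: one dataset per topic, all sharing one config facet. */
    public static LineageVertex kafkaVertex(String bootstrapServers, List<String> topics) {
        DatasetConfigFacet config = () -> Map.of("bootstrap.servers", bootstrapServers);
        // The namespace format below is an assumption made for this sketch.
        String namespace = "kafka://" + bootstrapServers;
        List<LineageDataset> datasets = topics.stream()
                .map(topic -> dataset(topic, namespace, Map.of("config", config)))
                .collect(Collectors.toList());
        return () -> datasets;
    }

    public static void main(String[] args) {
        LineageVertex vertex = kafkaVertex("broker:9092", List.of("orders", "payments"));
        for (LineageDataset d : vertex.datasets()) {
            System.out.println(d.namespace() + " " + d.name() + " " + d.facets().keySet());
        }
    }
}
```

A lineage listener receiving such a vertex would iterate over `datasets()` and read `name()`, `namespace()` and the facets it understands, which is close to how OpenLineage models datasets.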