+1 (binding)
Overall, I think this will make future development and growth of OL in Airflow 
much easier, which will hopefully lead to more adoption!

________________________________
From: Vikram Koka <vik...@astronomer.io.INVALID>
Sent: Monday, February 13, 2023 8:20:23 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][VOTE] AIP-53 OpenLineage in Airflow




+1 binding.
I have been looking at the doc and having lineage integrated with Airflow as a 
provider makes sense to me.


On Mon, Feb 13, 2023 at 2:38 AM Kaxil Naik <kaxiln...@gmail.com> wrote:
+1 binding. This should make lineage a first-class citizen for Airflow users. 
Excited for this one.

On Sun, 12 Feb 2023 at 07:57, Jarek Potiuk <ja...@potiuk.com> wrote:
A little side-track: a small comment on what Shubham wrote.

Yeah. I also noticed that AIP-47 was mentioned, but I considered that an
implementation detail. As I read it, those will be rather regular unit
tests, so not reaching out to external systems: that would make little
sense, and we definitely want the OpenLineage tests to run regularly
with every PR. Otherwise we would end up in the same boat as
currently, where the repos are separated out. I believe AIP-47 was
mentioned there more as an attempt to say "the test coverage will be
high". Julien, am I right?

On Sat, Feb 11, 2023 at 11:57 PM Mehta, Shubham
<shu...@amazon.com.invalid> wrote:
>
> +1 non-binding. I'll be on the lookout for initial PRs to learn more about 
> the implementation details of how System Tests will be extended to cover 
> these changes, as well as the ongoing maintenance required from providers. 
> The proposed changes should definitely make it easier for Airflow customers 
> to adopt lineage and improve stability. I'm looking forward to seeing how 
> customers will end up using it!
>
>
> Shubham
>
>
>
> From: Julien Le Dem <jul...@astronomer.io.INVALID>
> Reply-To: "dev@airflow.apache.org" <dev@airflow.apache.org>
> Date: Friday, February 10, 2023 at 3:28 PM
> To: "dev@airflow.apache.org" <dev@airflow.apache.org>
> Subject: [EXTERNAL] [VOTE] AIP-53 OpenLineage in Airflow
>
>
>
>
>
>
> Dear Airflow community,
>
>
>
> Following the discussion thread over the past few weeks, I'd like to call a 
> vote on AIP-53 OpenLineage in Airflow:
>
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-53+OpenLineage+in+Airflow
>
>
>
> The discussion thread is linked in the confluence doc if you wish to consult 
> the history of the conversation. Thank you to all who contributed!
>
>
>
> This is my (non-binding!) +1. The vote will last until midnight (UTC) on 
> Friday, 17th February.
>
>
>
> Thanks,
>
> Julien
>
>
>
> For reference, the Motivation section in the doc:
>
> Operational lineage collection is a common need to understand dependencies 
> between data pipelines and track end-to-end provenance of data. It enables 
> many use cases from ensuring reliable delivery of data through observability 
> to compliance and cost management.
>
> Publishing operational lineage is a core Airflow capability to enable 
> troubleshooting and governance.
>
> OpenLineage is a project of the LF AI & Data Foundation that provides a 
> spec standardizing operational lineage collection and sharing across the data 
> ecosystem. While it provides plugins for popular open source projects, its 
> intent is very similar to that of OpenTelemetry (also under the Linux 
> Foundation umbrella): to remain a spec for lineage exchange that projects, 
> open source or proprietary, implement.
>
> Built-in OpenLineage support in Airflow will make it easier and more reliable 
> for Airflow users to publish their operational lineage through the 
> OpenLineage ecosystem.
>
> The current external plugin maintained in the OpenLineage project depends on 
> Airflow and operator internals and breaks when those change. A built-in 
> integration ensures better, first-class support for exposing lineage: it is 
> tested alongside other changes and is therefore more stable.
>
> Today, OpenLineage consumers in the ecosystem include: Egeria (bank 
> compliance), Marquez (build your own metadata platform for compliance for 
> example), Microsoft Purview (Governance, …), Astro (data observability), 
> Amundsen. AWS recently blogged about using OpenLineage in the AWS ecosystem. 
> Other projects are at various levels of progress.
>
> On the producer side, there is support for open source projects like Airflow, 
> dbt, Spark, Flink, and Great Expectations, and for proprietary warehouses like 
> Snowflake, BigQuery, and Redshift, through API integration or SQL parsing.
>
> Examples of users talking about their usage of OpenLineage can be found on 
> the OpenLineage blog.
>
> This integration will also stimulate the continued growth of the OpenLineage 
> ecosystem and create more value for Airflow users.
