Re: [VOTE] AIP-65: Improve DAG history in UI

2024-07-23 Thread Oliveira, Niko
+1 (binding)


From: Brent Bovenzi 
Sent: Tuesday, July 23, 2024 11:10:47 AM
To: dev@airflow.apache.org
Subject: RE: [EXT] [VOTE] AIP-65: Improve DAG history in UI

CAUTION: This email originated from outside of the organization. Do not click 
links or open attachments unless you can confirm the sender and know the 
content is safe.



AVERTISSEMENT: Ce courrier électronique provient d’un expéditeur externe. Ne 
cliquez sur aucun lien et n’ouvrez aucune pièce jointe si vous ne pouvez pas 
confirmer l’identité de l’expéditeur et si vous n’êtes pas certain que le 
contenu ne présente aucun risque.



+1 binding

On Mon, Jul 22, 2024 at 1:09 PM Pavankumar Gopidesu 
wrote:

> +1 (non binding).
>
> Regards,
> Pavan
>
>
>
> On Mon, Jul 22, 2024, 18:02 Shahar Epstein  wrote:
>
> > +1 (binding).
> > Highly anticipated feature!
> >
> > On Fri, Jul 19, 2024 at 11:50 PM Jed Cunningham <jedcunning...@apache.org>
> > wrote:
> >
> > > I’m calling for a vote on AIP 65:
> > > https://cwiki.apache.org/confluence/x/T4qSEQ
> > >
> > > Discussion thread:
> > > https://lists.apache.org/thread/vvm43tfchyo92hmf40fqvmq0f5845bjr
> > >
> > > The vote will run for 5 days and last till next Wednesday, 2024-07-24
> > > 21:00 UTC.
> > >
> > > Everyone is encouraged to vote, although only PMC members' and committers'
> > > votes are considered binding.
> > >
> > > This is my +1.
> > >
> >
>


Re: [VOTE] AIP-66: DAG Bundles and Parsing

2024-07-23 Thread Oliveira, Niko
+1 (binding)


From: Shahar Epstein 
Sent: Monday, July 22, 2024 10:02:22 AM
To: dev@airflow.apache.org
Subject: RE: [EXT] [VOTE] AIP-66: DAG Bundles and Parsing

+1 (binding)

On Fri, Jul 19, 2024 at 11:43 PM Jed Cunningham 
wrote:

> I’m calling for a vote on this AIP 66:
> https://cwiki.apache.org/confluence/x/ZIqSEQ
>
> Discussion thread:
> https://lists.apache.org/thread/l8ksl144xd43jfk1wk3kz77t1xgbbq7z
>
> The vote will run for 5 days and last till next Wednesday, 2024-07-24
> 21:00 UTC.
>
> Everyone is encouraged to vote, although only PMC members' and committers'
> votes are considered binding.
>
> This is my +1.
>


Re: [VOTE] (v2) AIP-69 Remote Executor

2024-07-18 Thread Oliveira, Niko
Overall I'm +1

NOTE: I still strongly believe we should _not_ brand this "Remote Executors". We
already use "Remote Executors" (to mean CeleryExecutor, K8sExecutor, etc.) in
many contexts as a contrast to "Local Executors" (LocalExecutor,
SequentialExecutor). It's in our docs, blog posts, Airflow Summit talks,
everywhere. Overloading this term will confuse users who understand the
existing terminology. Instead we should go with other terms (also present in
your description) such as "Distributed Executors", "Decentralized Executors",
or something similar.

Great work on this one and it's exciting that it might make it for 2.10!


From: Scheffler Jens (XC-AS/EAE-ADA-T) 
Sent: Tuesday, July 16, 2024 10:36:48 PM
To: dev@airflow.apache.org
Subject: [EXT] [VOTE] (v2) AIP-69 Remote Executor

Hi Developers,

After some further discussion time I’d like to call for a vote for AIP-69. All 
details are described in:
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-69+Remote+Executor

Note:

  *   Compared to the first VOTE in
      https://lists.apache.org/thread/tyfsrpjn12sz9dw50pbg16dsv6lmj610 more
      details have been added
  *   A PoC PR is available in https://github.com/apache/airflow/pull/40224
  *   Status of progress in
      https://github.com/jscheffl/airflow/blob/feature/aip-69-poc/airflow/providers/remote/TODO.md
  *   A Q&A session was hosted, notes in
      https://lists.apache.org/thread/h2nxkto0lxgjnqj8yps0qsh7ppbccx6g

Remote Executor should be a special executor for use cases where a distributed
(non-central) setup across different security perimeters needs to be achieved
and a worker accesses the central site only via HTTP(S). It will leverage
AIP-61 (Hybrid Execution) and build on top of AIP-44 (at least the
parts needed for the worker; see the PoC PR, which already works on existing
structures).
The target is to deliver it with Airflow 2.10 as a pre-release. There it can be
tried out, tested and incrementally improved. It will integrate into Airflow
3 with AIP-72 and replace the AIP-44 task communication with it.


From the Q&A meeting, the main consensus went in the following direction:

- Remote Executor will be marked experimental and not contained in the default
release in the 2.10 line

- Even if installed, the remote endpoint will be disabled by default to minimize
risk of exposure

- We would release the provider package to PyPI only with a version suffix
"pre0", such that a user must explicitly install a pre-release version as a
manual install

- Support and maintenance in Airflow 2.10++ will end once the feature is
available in Airflow 3, to reduce double maintenance and as motivation to migrate



Why already in 2.10? With the existing structures in Airflow 2.10 we can get
started; it is already working, with limitations. From there we can use it,
learn on a running system and incrementally enhance and improve.



The vote will run for 6 days and last till next Tuesday, 23rd of July 2024, 8:00
UTC.



Everyone is encouraged to vote, although only PMC members' and committers' votes
are considered binding.



This is my +1.

Mit freundlichen Grüßen / Best regards

Jens Scheffler

Alliance: Enabler - Tech Lead (XC-AS/EAE-ADA-T)
Robert Bosch GmbH | Hessbruehlstraße 21 | 70565 Stuttgart-Vaihingen | GERMANY | 
www.bosch.com
Tel. +49 711 811-91508 | Mobil +49 160 90417410 | 
jens.scheff...@de.bosch.com

Sitz: Stuttgart, Registergericht: Amtsgericht Stuttgart, HRB 14000;
Aufsichtsratsvorsitzender: Prof. Dr. Stefan Asenkerschbaumer;
Geschäftsführung: Dr. Stefan Hartung, Dr. Christian Fischer, Dr. Markus 
Forschner,
Stefan Grosch, Dr. Markus Heyn, Dr. Frank Meyer, Dr. Tanja Rückert


Re: [VOTE] AIP-72: Task Execution Interface aka Task SDK

2024-07-11 Thread Oliveira, Niko
+1 binding

As Jarek mentioned, still some discussions ongoing but I think that can be 
hashed out later. Looks good overall.


From: Elad Kalif 
Sent: Thursday, July 11, 2024 9:11:19 AM
To: dev@airflow.apache.org
Subject: RE: [EXT] [VOTE] AIP-72: Task Execution Interface aka Task SDK

+1 binding

On Thu, Jul 11, 2024 at 5:50 PM Bishundeo, Rajeshwar
 wrote:

> +1 non-binding
>
> -- Rajesh
>
>
>
>
>
>
> On 2024-07-11, 9:57 AM, "Vincent Beck" <vincb...@apache.org> wrote:
>
>
> +1 binding
>
>
> On 2024/07/11 13:32:14 Igor Kholopov wrote:
> > +1, non-binding
> >
> > Some alignment with AIP-66 might be required, but the general vision
> > implementation looks clear to me.
> >
> > Thanks for leading this effort!
> >
> > On Thu, Jul 11, 2024 at 3:21 PM Jed Cunningham
> > wrote:
> >
> > > +1 binding
> > >
> >
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> For additional commands, e-mail: dev-h...@airflow.apache.org
>
>
>
>
>
>


Re: [VOTE] Proposal for adding Telemetry via Scarf

2024-05-09 Thread Oliveira, Niko
+1 binding


From: Jed Cunningham 
Sent: Thursday, May 9, 2024 7:59:03 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [VOTE] Proposal for adding Telemetry 
via Scarf

+1 binding


Re: [VOTE] AIP-67 Multi-team deployment of Airflow components

2024-04-18 Thread Oliveira, Niko
+1 binding

Excited for this one!


From: Aritra Basu 
Sent: Thursday, April 18, 2024 7:37:08 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [VOTE] AIP-67 Multi-team deployment 
of Airflow components

+1 (non-binding)

--
Regards,
Aritra Basu

On Thu, Apr 18, 2024, 7:42 PM Bishundeo, Rajeshwar
 wrote:

> +1 (non-binding)
>
> -- Rajesh
>
>
>
>
>
>
> On 2024-04-18, 10:11 AM, "Vincent Beck" <vincb...@apache.org> wrote:
>
>
> +1 binding
>
>
> On 2024/04/18 11:10:32 Jarek Potiuk wrote:
> > Hello here.
> >
> > I have not heard a lot of feedback after my last update, so let me
> > start a vote, hoping that the last changes proposed addressed most of the
> > concerns.
> >
> > Just to recap, the proposal is here:
> >
> > https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-67+Multi-team+deployment+of+Airflow+components
> >
> > Summarizing the most recent changes - responding to comments and doubts
> > raised during the discussion:
> >
> > * renamed it to be multi-team to clarify that this is the only case it
> > addresses
> > * split it into two phases: without and with internal API AIP-44 (also
> > named as GRPC server)
> > * simplified the approach for Variables and Connections, where no changes
> > in Airflow will be needed to handle the DB updates.
> >
> > This makes phase 1 simpler and not dependent on AIP-44.
> >
> > The vote will last till next Friday 26th of April 2024 Noon CEST.
> >
> > J.
> >
>
>
>


Re: [ANNOUNCE] New committer: Wei Lee

2024-04-08 Thread Oliveira, Niko
Congrats Wei! Well deserved :)


From: Scheffler Jens (XC-AS/EAE-ADA-T) 
Sent: Monday, April 8, 2024 2:12:41 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [ANNOUNCE] New committer: Wei Lee

Welcome to the "club"

@ash Haha, the name was a copy error from the previous welcome. Welcome, WEI!

Mit freundlichen Grüßen / Best regards

Jens Scheffler


-Original Message-
From: Ferruzzi, Dennis 
Sent: Monday, April 8, 2024 10:07 PM
To: dev@airflow.apache.org
Subject: Re: [ANNOUNCE] New committer: Wei Lee

Congrats, and welcome!


 - ferruzzi



From: Tomasz Urbaszek 
Sent: Monday, April 8, 2024 12:42 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [ANNOUNCE] New committer: Wei Lee

Congrats!

On Mon, 8 Apr 2024 at 19:56, Ankit Chaurasia  wrote:

> Congratulations Wei
>
> *Ankit Chaurasia*
> HomePage | LinkedIn
>
>
>
>
>
>
> On Mon, 8 Apr 2024 at 23:38, Pierre Jeambrun 
> wrote:
>
> > Congratulations!
> >
> > Le lun. 8 avr. 2024 à 16:49, Wei Lee  a écrit :
> >
> > > Thanks for all your support! Can’t be more excited.
> > >
> > > Best,
> > > Wei
> > >
> > > > On Apr 8, 2024, at 9:15 PM, Vincent Beck wrote:
> > > >
> > > > Congrats Wei! Well deserved!
> > > >
> > > > On 2024/04/08 13:03:50 Hemkumar Chheda wrote:
> > > >> Congratulations Wei! Best news ever
> > > >>
> > > >>> On 8 Apr 2024, at 6:10 PM, Bishundeo, Rajeshwar wrote:
> > > >>>
> > > >>> Congratulations Wei!! Good job and well deserved!!
> > > >>>
> > > >>> -- Rajesh
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>> On 2024-04-08, 8:36 AM, "Abhishek Bhakat" <abhishek.bha...@astronomer.io.INVALID> wrote:
> > > >>>
> > > >>>
> > > >>> Yay! Congrats Wei!
> > > >>>
> > > >>>
> > > >>> On Mon, Apr 8, 2024 at 11:36 AM Aritra Basu <aritrabasu1...@gmail.com>
> > > >>> wrote:
> > > >>>
> > > >>>
> > >  Congrats wei! Great job!
> > > 
> > >  --
> > >  Regards,
> > >  Aritra Basu
> > > 
> > >  On Mon, Apr 8, 2024, 

Re: [DISCUSS] DRAFT AIP-67 Multi-tenant deployment of Airflow components

2024-04-05 Thread Oliveira, Niko
Thanks for the reply Gabe! I'm glad you're interested in the topic and weighing 
in for your company.


> @Niko – If I misstated your intent, I hope you will clarify that for me. 
> Thanks!

The intent (DB isolation being critical) was there, but the approach you 
attributed to me is not actually what I'm proposing:

> 1.  A separate db per tenant, as Niko suggests.

I'm not suggesting a separate DB per tenant. AIP-67 along with AIP-44 together 
actually already provide a solution to isolate users from each other at the DB 
level (basically adding the DB API layer and then putting access control on 
that API). I'm just advocating that we _keep_ that in the scope of AIP-67 
proposal because I think that level of isolation is critical.

Cheers,
Niko




From: Gabe Schenz 
Sent: Friday, April 5, 2024 2:55:08 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] DRAFT AIP-67 Multi-tenant 
deployment of Airflow components

The outcome that Niko brings up is that any tenants should not be able to 
interact with metadata that is from other tenants within the Airflow 
environment.  If we focus on that as an outcome, what are the various options 
which would support that goal?


  1.  A separate db per tenant, as Niko suggests.
  2.  A separate schema within the metadata database
  3.  Applying row-based access control policies within the metadata db.
  4.  …?

For #3, PostgreSQL supports row-level security [1], which could be a sufficient
means of separation between tenants. I feel that an approach like this allows
for efficient use of resources while respecting the security needs of many
organizations. This seems simpler from an implementation perspective as well.
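
To make option #3 a bit more concrete, here is a minimal sketch of what such a
policy could look like. It only builds the PostgreSQL statements as strings and
does not connect to a database. The `tenant_id` column, the
`app.current_tenant` session setting and the helper name `rls_statements` are
assumptions made up for illustration; Airflow's metadata tables do not have
such a column today, and this is not what AIP-67 proposes.

```python
def rls_statements(table: str,
                   tenant_column: str = "tenant_id",
                   setting: str = "app.current_tenant") -> list[str]:
    """Build PostgreSQL statements that limit a table to the current tenant's rows.

    All names are hypothetical. Identifiers are interpolated without quoting,
    so only pass trusted, hard-coded names.
    """
    policy = f"{table}_tenant_isolation"
    return [
        # Row-level security is opt-in per table.
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY;",
        # Only rows whose tenant column matches the session setting are visible.
        f"CREATE POLICY {policy} ON {table} "
        f"USING ({tenant_column} = current_setting('{setting}'));",
    ]


if __name__ == "__main__":
    for statement in rls_statements("dag_run"):
        print(statement)
```

Each tenant's connection would then be expected to run something like
`SET app.current_tenant = 'team_a'` before querying. Note that in PostgreSQL
superusers always bypass row-level security, and the table owner does too
unless `FORCE ROW LEVEL SECURITY` is set, so the roles used by tenants would
need to be chosen with that in mind.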

As a data engineer in an enterprise that has separate Airflow environments for 
each team, I feel that we would greatly benefit from the ability to have a 
single set of resources to manage while allowing tenants to have their own 
separate secrets (connections, variables) and their own access policies (object 
storage and other resources) within our cloud provider’s platform.

@Niko – If I misstated your intent, I hope you will clarify that for me. Thanks!

[1] https://www.postgresql.org/docs/current/ddl-rowsecurity.html

Cheers,
Gabe

--
Gabe Schenz


From: Oliveira, Niko 
Date: Thursday, March 28, 2024 at 12:38 PM
To: dev@airflow.apache.org 
Subject: [EXTERNAL] Re: [DISCUSS] DRAFT AIP-67 Multi-tenant deployment of 
Airflow components
This Message originated outside your organization.

Hey folks, just some thoughts on the topics below:

1) I'm not too fussed about the naming. There have been many years of us
branding this multitenancy (talks, townhalls, email chains, etc.), so a lot of
our users are already familiar with this name. I'm not sure we'll benefit much
by changing it; rather, we'll mostly just confuse people. But also a more
accurate name is never a bad thing either.

2) I'm not sold on reducing the scope of the AIP in this way. I think the DB
isolation is one of the most important pieces. We have spoken with many users
and customers of MWAA whose top requirement of anything multi-tenant is that in
no way can users query or modify DB records for other tenants. You could put
XCom, variables, secrets, connections in different backends of course, but the
serialized DAGs are still in the DB, never mind DAG run history and other
related bits.

A two-phased approach like Jarek mentioned could be okay (or in parallel if we
have the people power to do so), but I'd still like to vote on the whole pack
as it is rather than reduce the scope in this way.

Just my 2c


From: Jarek Potiuk 
Sent: Wednesday, March 27, 2024 6:04:14 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] DRAFT AIP-67 Multi-tenant 
deployment of Airflow components

Hey Ash (and others who already commented). I hoped, that writing the
proposal down will finally make us all discuss (and eventually converge) on
what "multi-tenancy" means for Airflow and I am gla

Re: [DISCUSS] Consider disabling self-hosted runners for commiter PRs

2024-04-04 Thread Oliveira, Niko
+1, I'd love to see this as well.

In the past, stability and long queue times of PR builds have been very
frustrating. I'm not 100% sure this is due to using self-hosted runners, since
a queue depth of 35 (to my mind) should be plenty. But something about that setup
has never seemed quite right to me with queuing. Switching to public runners
for a while as an experiment would be a great way to see if it improves.


From: Pankaj Koti 
Sent: Thursday, April 4, 2024 12:41:02 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Consider disabling 
self-hosted runners for commiter PRs

+1 from me to this idea.

Sounds very reasonable to me.
At times, my experience has been better with public runners instead of
self-hosted runners :)

And as already mentioned in the discussion, I think having the ability to
apply the label "use-self-hosted-runners" at critical
times would be nice to have too.


On Fri, 5 Apr 2024, 00:50 Jarek Potiuk,  wrote:

> Hello everyone,
>
> TL;DR With some recent changes in GitHub Actions and the fact that ASF has
> a lot of runners available donated for all the builds, I think we could
> experiment with disabling "self-hosted" runners for committer builds.
>
> The self-hosted runners of ours have been extremely helpful (and we should
> again thank Amazon and Astronomer for donating credits / money for those) -
> when the GitHub public runners were far less powerful - and we had
> fewer of those available for ASF projects. This saved us a LOT of
> trouble when there was contention between ASF projects.
>
> But as of recently both limitations have been largely removed:
>
> * ASF has 900 public runners donated by GitHub to all projects
> * Those public runners (as of January) now have, for open-source projects,
> 4 CPUs and 16GB of memory -
>
> https://github.blog/2024-01-17-github-hosted-runners-double-the-power-for-open-source/
>
>
> While they are not as powerful as our self-hosted runners, the parallelism
> we utilise brings those builds into not-that-bad shape compared to
> self-hosted runners. Typical times for the complete set of tests are now
> ~20m on public runners and ~14m on self-hosted ones.
>
> But this is not the only factor - I think committers experience "Job
> failed" on self-hosted runners generally much more often than
> non-committers (the stability of our solution is not the best, and we are
> also using cheaper spot instances). Plus - we limit the total number of
> self-hosted runners (35) - so if several committers submit a few PRs and we
> have a canary build running, the jobs will wait until runners are available.
>
> And of course it costs the credits/money of sponsors which we could use for
> other things.
>
> I have - as of recently - access to GitHub Actions metrics - and while ASF
> is keeping an eye on usage and has started limiting the number of parallel
> jobs that workflows in projects can run, it looks like even if all committer
> runs are added to the public runners, we will still cause far lower usage
> than the limits and far lower than some other projects (which I will not
> name here). I have access to the metrics so I can monitor our usage and react.
>
> I think possibly - if we switch committers to "public" runners by default
> - the experience will not be much worse for them (and sometimes even better
> - because of stability/limited queue).
>
> I was planning this carefully - I made a number of refactors/changes to our
> workflows recently that make it way easier to manipulate the configuration
> and get various conditions applied to various jobs - so
> changing/experimenting with those settings should be - well - a breeze :).
> A few recent changes have proven that this workflow refactor was
> definitely worth the effort. I feel like I finally got control over it,
> where previously it was a bit like herding a pack of cats (which I
> brought to life myself, but that's another story).
>
> I would like to propose to run an experiment and see how it works if we
> switch committer PRs back to the public runners - leaving the self-hosted
> runners only for canary builds (which makes perfect sense because those
> builds run a full set of tests and we need as much speed and power there as
> we can get).
>
> This is pretty safe. We should be able to switch back very easily if we see
> problems. I will also monitor it and see if our usage is within the limits
> of the ASF. I can also add 

Re: [DISCUSS] DRAFT AIP-67 Multi-tenant deployment of Airflow components

2024-03-28 Thread Oliveira, Niko
Hey folks, just some thoughts on the topics below:

 1) I'm not too fussed about the naming. There have been many years of us
branding this multitenancy (talks, townhalls, email chains, etc.), so a lot of
our users are already familiar with this name. I'm not sure we'll benefit much
by changing it; rather, we'll mostly just confuse people. But also a more
accurate name is never a bad thing either.

 2) I'm not sold on reducing the scope of the AIP in this way. I think the DB
isolation is one of the most important pieces. We have spoken with many users
and customers of MWAA whose top requirement of anything multi-tenant is that in
no way can users query or modify DB records for other tenants. You could put
XCom, variables, secrets, connections in different backends of course, but the
serialized DAGs are still in the DB, never mind DAG run history and other
related bits.

A two-phased approach like Jarek mentioned could be okay (or in parallel if we
have the people power to do so), but I'd still like to vote on the whole pack
as it is rather than reduce the scope in this way.

Just my 2c


From: Jarek Potiuk 
Sent: Wednesday, March 27, 2024 6:04:14 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] DRAFT AIP-67 Multi-tenant 
deployment of Airflow components

Hey Ash (and others who already commented). I hoped, that writing the
proposal down will finally make us all discuss (and eventually converge) on
what "multi-tenancy" means for Airflow and I am glad we have different
opinions - both about naming and scope of the change.

I think those are two things that are worth discussing a bit separately:

1) "Naming" - Is it enough for "team separation" to use the name
"multi-tenancy"? I think that one deserves even something like a poll or
eventually voting to see what's a general perception of what "tenancy" and
particularly "multi-tenancy" is. Over the last few weeks I've been talking
to a number of parties (including some offline talks) and it seems that
"multi-tenancy" has completely different connotations for people - depending
on where they come from. For most maintainers it seems what the proposal is
about is "not multi-tenancy", for those who would like to serve different
customers - it's also not "multi-tenancy", but for most of the "platform
teams" I spoke to (those who manage an internal deployment of Airflow for a
big organisation with multiple teams/departments) - this is "precisely
multi-tenancy".

I am absolutely happy if we use a different name eventually, I am not at all
attached to multi-tenancy as a name, and I am happy with, for example,
"multi-team airflow deployment". I started to like the name actually as it
more explicitly explains what the proposal is (and it still keeps the
multi-"t..."). And since there are so many people who get confused about
it - I think I would run a quick poll during the voting period where I'd add
"and how would you like it to be named" - and I can go either way (if of
course we agree this is a good AIP to be approved in general).

2) Scope. Yes. I hear your point, and yes - it's actually quite possible
already to do what you propose with a smaller number of changes. But I've heard
from a number of the users I spoke to - that they are struggling with
three things:

a) DAG authors from one team being able to access credentials for
the other team (connections)
b) dependencies/environment of one team conflicting with
dependencies of the other team
c) DAG authors from one team being able to (accidentally or on
purpose) influence the code of other teams

I think (and just to distill down the "simplification proposal") - what Ash
is explaining is basically a setup similar to what the AIP-67 DRAFT now
proposes but without the `internal-api` component (and let me know, Ash, if I
got it right).

As I understand the `simplification` proposal without internal-API - it
would make it possible to address a) - by forcing secrets use (not DB) -
and using a combination of deployment "guidelines" explained in the
proposal: per-team separation of queues/executors/triggerer/workers and
providing different configuration per team, where each team will have a
different set of credentials to access secrets (this is really the only way
we can prevent DAG authors from accessing other team credentials if they
have unbounded access to the DB). Similarly, having support for different
executors for different tenants could address b) much more easily than
today - introducing the concept 

Re: [VOTE] AIP-64: Keep TaskInstance try history

2024-03-26 Thread Oliveira, Niko
+1 (binding) Glad this is finally getting some love!


From: Ankit Chaurasia 
Sent: Tuesday, March 26, 2024 2:58:13 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [VOTE] AIP-64: Keep TaskInstance try 
history


+1, Looking forward to this.

*Ankit Chaurasia*






On Tue, 26 Mar 2024 at 15:02, Phani Kumar 
wrote:

> +1 binding - looking forward to this one
>
> On Tue, Mar 26, 2024 at 1:11 PM Amogh Desai 
> wrote:
>
> > +1 binding
> >
> > I like the thought behind this and understand why this was repeatedly
> asked
> > by the community.
> > Good work on the proposal so far!
> >
> > Thanks & Regards,
> > Amogh Desai
> >
> >
> > On Tue, Mar 26, 2024 at 9:58 AM Rahul Vats 
> wrote:
> >
> > > +1 (non-binding)
> > >
> > > Regards,
> > > Rahul Vats
> > > 9953794332
> > >
> > >
> > > On Mon, 25 Mar 2024 at 22:46, Jed Cunningham
> > > wrote:
> > >
> > > > Hello Airflow Community,
> > > >
> > > > I would like to start a vote on AIP-64: Keep TaskInstance try
> history.
> > > >
> > > > You can find the AIP here:
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-64%3A+Keep+TaskInstance+try+history
> > > >
> > > > Discussion Thread:
> > > > https://lists.apache.org/thread/vvm43tfchyo92hmf40fqvmq0f5845bjr
> > > >
> > > > This is the first step in the AIP-63 DAG Versioning journey, though
> > this
> > > > provides value in isolation as well:
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-63%3A+DAG+Versioning
> > > >
> > > > The vote will last until 2024-03-28 17:30 UTC and until at least 3
> > > binding
> > > > votes have been cast.
> > > >
> > > > Consider this my binding +1.
> > > >
> > > > Please vote accordingly:
> > > >
> > > > [ ] + 1 approve
> > > > [ ] + 0 no opinion
> > > > [ ] - 1 disapprove with the reason
> > > >
> > > > Only votes from PMC members and committers are binding, but other
> > members
> > > > of the community are encouraged to check the AIP and vote with
> > > > "(non-binding)".
> > > >
> > > > Thanks,
> > > > Jed
> > > >
> > >
> >
>


Re: [DISCUSS] Applying D105 rule for our codebase ("undocumented magic methods") ?

2024-03-20 Thread Oliveira, Niko
I'm -1 to enabling D105


I don't think it will lead to helpful documentation. I think for the rare cases
where it is required, it can be left up to the developer or caught in PR review.

Cheers,
Niko


From: Vincent Beck 
Sent: Wednesday, March 20, 2024 5:51:43 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Applying D105 rule for our 
codebase ("undocumented magic methods") ?


+1 for not enforcing as well. Let's leave maintainers the flexibility to
choose whether a given method should be documented.

On 2024/03/20 08:28:51 Ash Berlin-Taylor wrote:
> I'm for not enforcing this rule - as others have said it's very unlikely to
> result in more useful docs for developers or end users.
>
> -ash
>
> On 20 March 2024 08:12:40 GMT, Andrey Anshin  wrote:
> >±0 from my side
> >
> >Maybe we should review all current methods which do not follow this rule
> >to find the ones where a docstring is really useful, and not enforce the
> >rule (disable it). So to avoid unnecessary changes we might close
> >https://github.com/apache/airflow/issues/37523 and remove it from (or mark
> >it completed in) https://github.com/apache/airflow/issues/10742
> >
> >
> >
> >
> >
> >
> >On Wed, 20 Mar 2024 at 10:41, Pankaj Koti 
> >wrote:
> >
> >> +1 to what Aritra is saying.
> >>
> >>
> >> Best regards,
> >>
> >> *Pankaj Koti*
> >> Senior Software Engineer (Airflow OSS Engineering team)
> >> Location: Pune, Maharashtra, India
> >> Timezone: Indian Standard Time (IST)
> >> Phone: +91 9730079985
> >>
> >>
> >> On Wed, Mar 20, 2024 at 12:05 PM Aritra Basu 
> >> wrote:
> >>
> >> > I'm in general not a huge fan of documenting for the sake of documenting,
> >> > so I'd be in agreement of not enforcing it via code but rather be
> >> enforced
> >> > by the reviewers in cases they believe certain methods need documenting.
> >> >
> >> > --
> >> > Regards,
> >> > Aritra Basu
> >> >
> >> > On Wed, Mar 20, 2024, 9:39 AM Jarek Potiuk  wrote:
> >> >
> >> > > Hey here,
> >> > >
> >> > > I wanted to quickly poll what people think about applying the
> >> > > https://docs.astral.sh/ruff/rules/undocumented-magic-method/ rule in
> >> our
> >> > > codebase. There are many uncontroversial rules - but that one is
> >> somewhat
> >> > > more controversial than others.
> >> > >
> >> > > See
> >> https://github.com/apache/airflow/pull/37602#issuecomment-2001951402
> >> > > and
> >> > >
> >> >
> >> https://github.com/apache/airflow/pull/38277#pullrequestreview-1945745542
> >> > > for example
> >> > >
> >> > > I think that even in the ruff example, but also in many cases requiring
> >> > to
> >> > > document the methods will lead to rather useless documentation:
> >> > >
> >> > > class Cat(Animal):
> >> > >     def __str__(self) -> str:
> >> > >         """Return a string representation of the cat."""
> >> > >         return f"Cat: {self.name}"
> >> > >
> >> > > There is IMHO very little value in having such documentation. It might
> >> be
> >> > > useful in some cases where we have a really good reason to add such a
> >> > magic
> >> > > method and it is important to document it, but in many cases - the
> >> > > documentation will be just documenting what the magic method already
> >> > > explains well (like the case above).
> >> > >
> >> > > This actually reminds me of the early days of java documentation where
> >> > javadoc
> >> > > looked more or less like this:
> >> > >
> >> > > "Paints the object"
> >> > > func paint()
> >> > >
> >> > > "Repaints the object"
> >> > > func repaint()
> >> > >
> >> > > However - maybe I am wrong :). Maybe it's worth documenting those
> >> methods
> >> > > in bulk, even if in many cases it will not bring much value?
> >> > >
> >> > > WDYT ? Should we mandate documenting it - or leave up to the author to
> >> > > document it in case it feels like it is needed?
> >> > >
> >> > > J.
> >> > >
> >> >
> >>
>
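
To make the trade-off discussed in this thread concrete, here is a minimal
sketch. It is not taken from the Airflow codebase; the `legs` attribute and
the case-insensitive comparison are invented purely for illustration. It
contrasts a magic method whose docstring would only restate the obvious with
one whose behaviour is non-obvious enough that a docstring arguably earns its
place:

```python
class Cat:
    """Toy class; `name` and `legs` are illustrative attributes only."""

    def __init__(self, name: str, legs: int = 4) -> None:
        self.name = name
        self.legs = legs

    def __str__(self) -> str:
        # A docstring here ("Return a string representation of the cat.")
        # would only repeat what __str__ always means.
        return f"Cat: {self.name}"

    def __eq__(self, other: object) -> bool:
        """Compare by name only, ignoring case; `legs` is deliberately excluded."""
        if not isinstance(other, Cat):
            return NotImplemented
        return self.name.lower() == other.name.lower()

    def __hash__(self) -> int:
        # Must stay consistent with __eq__, so hash the same normalised key.
        return hash(self.name.lower())
```

With D105 disabled, the second kind of docstring is still something a
reviewer can ask for in a PR.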




Re: [DISCUSS] Considering trying out uv for our CI workflows

2024-02-27 Thread Oliveira, Niko
Fantastic results!

> It also means that if you've been using breeze and were sometimes afraid to
> hit "y" to rebuild the image, being afraid that it will take 20 minutes or
> so - not any more. It should be WAY faster now.

I'm very excited about this speed up as well as our CI :)


From: Jarek Potiuk 
Sent: Tuesday, February 27, 2024 2:44:14 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Considering trying out uv 
for our CI workflows


Summarising where we are:

After ~24 hrs of operations, it looks really cool and fulfills (and
actually exceeds) all my expectations.

* Multiple PRs succeeded, we got quite a few constraints updated
automatically after successful canary runs:
https://github.com/apache/airflow/commits/constraints-main/ (and they look
perfectly fine - pretty much what I'd expect)
* I looked through a number of image builds in "canary" runs and the
regular 10-12 minutes build-image jobs are down to 3-4 minutes
* I just did an experiment: on my machine I ran a complete from-scratch
build of the CI image with new dependencies for breeze (with `breeze ci
image build --python 3.9 --docker-cache disabled
--upgrade-to-newer-dependencies`) and compared it with the v2-8-test branch,
where we do not have the change applied yet

Results (on my desktop machine: 16 cores, 1Gb network download and a very
fast disk):

* v2-8-test: 730 s -> *12 minutes *
* main: 227 s -> less than *4 minutes (!)*

That's about 70% (!) less build time (roughly 3.2x faster). This is a
complete full rebuild of the image, including installing all dependencies
from scratch and attempting to upgrade them to the latest compatible
versions. That is the WORST case.
Of course it will vary - depending on the network speed you have and number
of CPU (unlike `pip` for now `uv` heavily uses parallelism - both for
downloads and installation and that is one of the reasons why the
difference is so huge). I'd love to hear the results of such comparisons
from others with different machines/networking/disks - to get a bit more
scientific data points.
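
As a rough check of the numbers above, the two timings can be turned into a
percentage and a ratio. This is only arithmetic on the 730 s and 227 s figures
reported in this message; the helper name `speedup` is made up for the
example:

```python
def speedup(before_s: float, after_s: float) -> tuple[float, float]:
    """Return (percent of build time saved, how many times faster)."""
    saved_pct = (before_s - after_s) / before_s * 100
    ratio = before_s / after_s
    return saved_pct, ratio


# Timings from the comparison above: v2-8-test vs main.
saved_pct, ratio = speedup(730, 227)
print(f"time saved: {saved_pct:.1f}%  speed-up: {ratio:.2f}x")
```

Anyone repeating the experiment on a different machine can plug their own
timings into the same two formulas to get comparable data points.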

It also means that if you've been using breeze and were sometimes afraid to
hit "y" to rebuild the image, being afraid that it will take 20 minutes or
so - not any more. It should be WAY faster now.

I will also proceed to attempt to use the `--resolution lowest` soon and
try to see if we can have a nice automation in place to bump our
min-versions to the "actually working" versions - for all our extras. That
would be a major win for our users - as there will never be a case in the
future that they upgrade airflow to a newer version and some old dependency
remains and is not compatible. It does not happen often,

Seeing the speed difference - I am actually going to regularly use `uv
pip` for any local installation as well - it should save a LOT of time -
especially since, if you have multiple environments, it keeps a single cache
for all your installed packages (and their metadata) - this means that if
you have several virtualenvs installed and switch between them, the
installation and reinstallation of packages between those environments should
be lightning fast (like single seconds rather than 10s of seconds for the
smallest installation). I'd heartily recommend it to anyone.

Let's see about the stability. I know there are a few edge-cases that are not
handled well - Damian helpfully pointed out the "apache-airflow[all]"
case that currently is problematic, so I will keep an eye on new versions
and fixes (in our CI we are currently pinned to 0.1.10 - so we are
shielded from any potential stability problems and we will need to manually
upgrade to newer versions when they appear).

J.


Re: [DISCUSS] Considering trying out uv for our CI workflows

2024-02-21 Thread Oliveira, Niko
The Astral folks also seem very focused on it being a drop-in/compliant 
replacement for pip. So I think it's definitely worth dropping it in and seeing 
if we get the expected performance improvements. If tests still pass and user 
facing constraints and install instructions remain unchanged I don't see why 
not, if someone is willing to spend the time on it. Not to mention the extra
features it would give us (I, like others, am also very excited about the
--resolution=lowest ability).


From: Andrey Anshin 
Sent: Tuesday, February 20, 2024 12:26:56 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Considering trying out uv 
for our CI workflows


> I share Andrey's skepticism. It's just yet another tool which has an unclear
> development strategy.

My point was more about a matter of presentation. If someone told you "this
is a new tool, a killer of the previous tools" then you might think
"Yeah...yeah...yeah.. yet another replacement for tool X... not really
interesting". On the other hand, if someone told you which cases it might
solve, then this might be a mind changer.

Especially the promising `--resolution=lowest` option. We always want to
test something with minimal dependencies because we are not sure that it
works with pretty old dependencies, and recently I've started to work
on a POC to collect minimal versions for Airflow and the Providers. And at the
moment when I had almost finished it, uv was released. Well, sometimes it is
better to wait a bit: maybe someone will invent the same
solution and you don't have to spend your personal time.

So as a POC I'm on it. We still need `pip`, and need to validate some things
with pip, because it is the only officially supported way to install Airflow,
but if something could be improved in the CI then I'm on it. In most cases it
would sit behind Breeze, and many of the contributors might not even
notice that something changed.
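
A minimal sketch of how the two installers could sit side by side in a
script. It only builds the command lines and does not run anything; the helper
names are hypothetical, and it assumes that `uv pip install` accepts
`--resolution lowest` as described in the uv announcement, so check
`uv pip install --help` for the version you have:

```python
import shlex


def uv_install_cmd(requirements: list[str], lowest: bool = False) -> list[str]:
    """Build (but do not run) a `uv pip install` command line."""
    cmd = ["uv", "pip", "install"]
    if lowest:
        # Assumption: the resolver strategy flag mentioned in this thread.
        cmd += ["--resolution", "lowest"]
    return cmd + requirements


def pip_install_cmd(requirements: list[str]) -> list[str]:
    """Build the equivalent plain `pip` command, still the supported install path."""
    return ["python", "-m", "pip", "install", *requirements]


# Print the two command lines so they can be compared or copied into CI.
print(shlex.join(uv_install_cmd(["apache-airflow"], lowest=True)))
print(shlex.join(pip_install_cmd(["apache-airflow"])))
```

Keeping both builders next to each other mirrors the idea above: `uv` for the
fast CI path, `pip` for validating the officially supported installation.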








On Tue, 20 Feb 2024 at 09:56, Jarek Potiuk  wrote:

> Actually - if you read that blog post, the strategy is clear - they aim to
> create comprehensive packaging tooling, and the improvements are measured
> (80-100 times, they claim, when using caching - they (unlike pip) use a lot
> of local caching, including for resolving dependencies).
>
> So I think both arguments are not valid if you ask me.
>
> On Tue, 20 Feb 2024, 02:37 Alexander Shorin
> wrote:
>
> > I share Andrey's skepticism. It's just yet another tool which has an
> > unclear development strategy. Should you make it a free testing suite?
> What
> > would the project receive in exchange? A lot of words about being faster,
> > but how much? Are these milliseconds worth changing the stable tool for a
> new
> > one? And will it notably improve something?
> >
> > I think it's worth trying it just for fun and providing feedback, but it
> > has a long road to travel to become as stable as pip.
> >
> > --
> > ,,,^..^,,,
> >
> >
> > On Tue, Feb 20, 2024 at 3:06 AM Jarek Potiuk  wrote:
> >
> > > My opinion:
> > >
> > > I think there is a place for a number of such tools. For a long time
> the
> > > packaging team and `pip` team have been working not only on `pip`
> > > implementation but also (and most importantly) to make sure that what
> > `pip`
> > > does is to be the beacon of standardisation of packaging APIs and PEPs.
> > It
> > > will never IMHO have a lot of the fancy features that other tools might
> > > provide (like the ones I mentioned). It will always be there to provide
> > the
> > > robust and solid CLI to run all packaging things, but there are plenty
> of
> > > opportunities to provide improved or modified, or more (or less)
> > > opinionated ways of doing things that are addressing some cases that
> > `pip`
> > > team simply will not be able or willing to handle, preferring "pure"
> > > standard approach vs. implement all the optional things. For example
> the
> > > way how pre-releases are handled can be improved to be more selective.
> > The
> > > PEP describing it gives the tools an option to add more fancy
> behaviours
> > > (some of which we could find useful in our CI tooling). Should `pip`
> > > implement those - I don't think so. It would distract maintainers from
> > > other more important things. It is quite ok to use other tooling in
> > places
> > > like our CI, where they do some parts of the installation better.
> > >
> > > For me `pip` is going more into the direction of `usable reference
> > > implementation of package installed` - any standard/ PEP 

Re: [DISCUSS] Rename channels on slack

2024-02-12 Thread Oliveira, Niko
I'm happy with that set of names +1


From: Jarek Potiuk 
Sent: Monday, February 12, 2024 3:37:35 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Rename channels on slack


Yeah. Precisely I think the persona is the key problem - similarly to the
recent change in our community page https://airflow.apache.org/community/
where we split "Want to contribute?" and "Are you a user?" (and an
observation that #development is in "user" section does not make it
clearer). So I'd avoid "development" at all.

I like Niko's idea about using personas as channel names.

How about this - not very short, but pretty explicit:

#contributors (with archiving current #development and announcement there)
#new-contributors (renamed from #first-pr-support)
#user-troubleshooting
#user-best-practices (let's see how popular it will be - we can close it if
not used)
#random

J.




On Mon, Feb 12, 2024 at 12:22 PM Bolke de Bruin  wrote:

> The challenge is the persona here. I bet people think that they are doing
> "dev" on Airflow when writing dags :-).
>
> On Fri, 9 Feb 2024 at 20:17, Briana Okyere
>  wrote:
>
> > Love this thread. From my experience in community management (for
> whatever
> > it's worth), naming channels after personas is the way to go. I also
> think
> > we can consolidate some of these. I think:
> >
> > #dev-support -> from #development
> > #troubleshooting
> > #best-practices
> > #random
> > #first-pr-support --> change to #contributing
> >
> > On Fri, Feb 9, 2024 at 10:33 AM Oliveira, Niko
> > wrote:
> >
> > > I'm not too picky on the names, but I'm +5 to Jed's approach of just
> > > archiving the current development channel and starting fresh with a new
> > > channel for contribution. There is no manner of rebranding that we can do
> to
> > > save #development now, with that many folks in it.
> > >
> > > I'll throw #contributors into the ring as a name for the new
> development
> > > channel. I like naming it after a persona. Then as a user you know that
> > if
> > > you're trying to contribute that's where you should hang out.
> > >
> > > 
> > > From: Aritra Basu 
> > > Sent: Thursday, February 8, 2024 9:50:42 PM
> > > To: dev@airflow.apache.org
> > > Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Rename channels on
> > > slack
> > >
> > >
> > > Hmm, I'm not a big fan of dev support since I would see it as support
> > for a
> > > developer using airflow vs a developer developing airflow. I much
> prefer
> > > contributing since it makes it clear it's for someone contributing to
> > > airflow.
> > >
> > > --
> > > Regards,
> > > Aritra Basu
> > >
> > > On Fri, Feb 9, 2024, 10:42 AM Amogh Desai 
> > > wrote:
> > >
> > > > Good discussions going on. Personally I align with what Jed, Vincent
> > and
> > > > Jarek had to say.
> > > >
> > > > My choices would be:
> > > > #dev-support -> from #development
> > > > #troubleshooting
> > > > #best-practices
> > > > #random
> > > > #first-pr-support
> > > >
> > > > But yes, maybe we can change it to something better as well? Any
> ideas?
> > > >
> > > > Why not just #first-pr
> > > >
> > > > Or maybe someone can propose a better name for #contributing that is
> > 

Re: [DISCUSS] Rename channels on slack

2024-02-09 Thread Oliveira, Niko
I'm not too picky on the names, but I'm +5 to Jed's approach of just archiving
the current development channel and starting fresh with a new channel for
contribution. There is no manner of rebranding that we can do to save
#development now, with that many folks in it.

I'll throw #contributors into the ring as a name for the new development 
channel. I like naming it after a persona. Then as a user you know that if 
you're trying to contribute that's where you should hang out.


From: Aritra Basu 
Sent: Thursday, February 8, 2024 9:50:42 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Rename channels on slack


Hmm, I'm not a big fan of dev support since I would see it as support for a
developer using airflow vs a developer developing airflow. I much prefer
contributing since it makes it clear it's for someone contributing to
airflow.

--
Regards,
Aritra Basu

On Fri, Feb 9, 2024, 10:42 AM Amogh Desai  wrote:

> Good discussions going on. Personally I align with what Jed, Vincent and
> Jarek had to say.
>
> My choices would be:
> #dev-support -> from #development
> #troubleshooting
> #best-practices
> #random
> #first-pr-support
>
> But yes, maybe we can change it to something better as well? Any ideas?
>
> Why not just #first-pr
>
> Or maybe someone can propose a better name for #contributing that is less
> ambiguous?
>
> What do you all think of #dev-support? The reason the confusion arises is
> that #contributing can translate to many
> meanings.
>
> Thanks & Regards,
> Amogh Desai
>
>
>
>
> On Fri, Feb 9, 2024 at 4:45 AM Jarek Potiuk  wrote:
>
> > #learn-contributing maybe ?
> >
> > On Fri, Feb 9, 2024 at 12:14 AM Jarek Potiuk  wrote:
> >
> > > Vincent:
> > >
> > > > I think we can modify that after we renamed the channels.
> > >
> > > Yes, definitely.
> > >
> > > Alexander:
> > >
> > > > Why a special channel for the first PR?
> > >
> > > Yeah. I do not particularly like this name either. It used to be called
> > > #newbie-questions - which was a bit intimidating (especially for
> > > experienced programmers who wanted to do their first contributions)
> > > But yes, maybe we can change it to something better as well? Any ideas?
> > >
> > > J.
> > >
> > >
> > >
> > >
> > > On Thu, Feb 8, 2024 at 11:43 PM Vincent Beck 
> > wrote:
> > >
> > >> I like #contributing. Also, I just realized that the confusion around
> > >> #development might come from https://airflow.apache.org/community/
> > where
> > >> #development should be listed under "Want to contribute?" and not "Are
> > you
> > >> a user?". I think we can modify that after we renamed the channels.
> > >>
> > >> On 2024/02/08 22:35:20 Jarek Potiuk wrote:
> > >> > Speaking of which (#development-first-pr-support) - it indeed
> creates
> > >> > potential confusion, because #contributing might be ambiguous. But
> > >> likely
> > >> > this will be far less of a problem and we can do just that (shorter
> is
> > >> > better):
> > >> >
> > >> > #contributing
> > >> > #troubleshooting
> > >> > #best-practices
> > >> > #random
> > >> > #first-pr-support
> > >> >
> > >> > Or maybe someone can propose a better name for #contributing that is
> > >> less
> > >> > ambiguous?
> > >> >
> > >> > J.
> > >> >
> > >> >
> > >> >
> > >> > On Thu, Feb 8, 2024 at 10:35 PM Scheffler Jens (XC-AS/EAE-ADA-T)
> > >> >  wrote:
> > >> >
> > >> > > I was +1 to Jarek first but reading the counter proposal from
> > Vincent I
> > >> > > change to +1 like Vincent proposes. Also not against Jarek but
> > >> shorter is
> > >> > > better.
> > >> > >
> > >> > > Sent from Outlook for iOS
> > >> > > 
> > >> > > From: Vincent Beck 
> > >> > > Sent: Thursday, February 8, 2024 3:29:24 PM
> > >> > > To: dev@airflow.apache.org 
> > >> > > Subject: Re: [DISCUSS] Rename channels on slack
> > >> > >
> > >> > > I am +1 in renaming these channels because, as said, most of the
> > messages
> > >> in
> > >> > > #development have nothing to do there.
> > >> > >
> > >> > > Though, I would just rename #development to #contributing. To me,
> > >> > > #troubleshooting is already a good name and clear. But this is
> only
> > my
> > >> > > personal opinion. I am not against the names Jarek suggested.
> > >> > >
> > >> > > On 2024/02/08 11:54:34 Jarek Potiuk wrote:
> > >> > > > Hey here,
> > >> > > >
> > >> > > > The number of "troubleshooting/best-practices" questions we have
> > in
> > >> > > > #development channel on Slack reached the level where we 

[RESULT][VOTE] AIP-61: Hybrid Execution

2024-02-06 Thread Oliveira, Niko
Hey folks!

The voting for AIP-61: Hybrid Execution
(https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-61+Hybrid+Execution)
was completed on February 06, 2024 PST, and I am happy to announce the following
voting result:

*Binding (+8) Votes*
Jens Scheffler
Jarek Potiuk
Amogh Desai
Dennis Ferruzzi
Sumit Maheshwari
Hussein Awala
Andrey Anshin
Ash Berlin-Taylor


*Non-binding (+7) Votes*
Aritra Basu
Wei Lee
Igor Kholopov
Ryan Hatter
Shubham Mehta
Rajesh Bishundeo
Eugen Kosteev


I would like to thank all the above who participated in this voting!
Link to the vote thread: 
https://lists.apache.org/thread/mkdlskz6tb0rbw36vglh54kfghl69kxs




[VOTE] AIP 61 - Hybrid Executors

2024-01-31 Thread Oliveira, Niko
Hey folks,


The AIP for Hybrid Executors has been out for a few weeks now. Some great 
feedback came in and some challenges to scope which I think have all been 
addressed, and the AIP document has been updated where applicable.


At this point I'd like to call a vote, and if all goes well, begin development 
soon!


You can find the AIP here:
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-61+Hybrid+Execution


Discussion threads:
https://lists.apache.org/thread/94sg7l4m3qjk4b3vfq3lr94oc5fs9q4j

The voting will last for 6 days (until 6th of February 2024, 22:00 PST), and 
until at least 3 binding votes have been cast.

Please vote accordingly:

[ ] + 1 approve
[ ] + 0 no opinion
[ ] - 1 disapprove with the reason

Only votes from PMC members and committers are binding, but other members of 
the community are encouraged to check the AIP and vote with "(non-binding)".

Thanks!




Re: [ANNOUNCE] Starting experimenting with "Require conversation resolution" setting

2024-01-30 Thread Oliveira, Niko
Yeah, I've found this to be pretty smooth as well. In most cases comments were 
already resolved and in lesser cases it was useful to see which conversations 
still needed addressing before merging. +1 from me!


From: Ryan Hatter 
Sent: Tuesday, January 30, 2024 8:58:30 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [ANNOUNCE] Starting experimenting 
with "Require conversation resolution" setting


In my experience outside of Airflow, the benefit of not missing a review
comment outweighs the friction of being required to resolve each
conversation.

On Mon, Jan 29, 2024 at 8:47 PM Wei Lee  wrote:

> I didn't notice much of a difference as a contributor. +1 vote
>
> Best,
> Wei
>
> > On Jan 30, 2024, at 11:41 AM, Amogh Desai 
> wrote:
> >
> > Contrary to my initial expectation of the trouble this would bring in for
> > reviewers, it has been
> > pretty nice. I have not faced any issues in marking the conversations as
> > resolved for the pull
> > requests I have reviewed and it has even given me a chance to re review
> > prior to approval.
> >
> > I am happy with this overall and my vote will be a +1
> >
> > Thanks & Regards,
> > Amogh Desai
> >
> > On Mon, Jan 29, 2024 at 7:56 PM Aritra Basu 
> > wrote:
> >
> >> I personally haven't had too much friction due to the change and it has
> >> helped me keep track of any comments people have made. I remain +1 to
> the
> >> change so far.
> >>
> >> --
> >> Regards,
> >> Aritra Basu
> >>
> >> On Mon, Jan 29, 2024, 6:11 PM Jarek Potiuk  wrote:
> >>
> >>> Just wanted to remind everyone, we are nearing the end of the trial
> >> period
> >>> for "require conversation" feature to be enabled. I have my own
> >>> observations and examples, but since I was the one to propose it, I am
> >>> likely biased, so I'd love to hear from others what their feedback and
> >>> assessment is. Or maybe we need more time to assess it ?
> >>>
> >>> I would love to hear your thoughts.
> >>>
> >>> J,
> >>>
> >>>
> >>> On Sat, Dec 30, 2023 at 2:20 PM Jarek Potiuk  wrote:
> >>>
>  After an initial indentation problem in .asf.yaml it's now working as
>  expected. So let's see how resolving conversations will work for
> >> us.
> 
>  On Sat, Dec 30, 2023 at 12:17 PM Amogh Desai <
> amoghdesai@gmail.com
> >>>
>  wrote:
> 
> > Wooho! Looking to see how this turns out for airflow 
> >
> > On Sat, 30 Dec 2023 at 1:35 PM, Jarek Potiuk 
> >> wrote:
> >
> >> Hello everyone,
> >>
> >> As discussed in
> >> https://lists.apache.org/thread/cs6mcvpn2lk9w2p4oz43t20z3fg5nl7l I
> >>> just
> >> enabled "require conversation resolution" for our main/stable
> >>> branches.
> > We
> >> have not used it in the past so it might not work as we think or we
> > might
> >> need to tweak something.
> >>
> >> Generally speaking (if all works) all conversations on PRs should be
> >> resolved before we can merge the PR. This "resolving" is encouraged
> >> to
> > be
> >> done by the author when they think the conversation is resolved, but
> >>> it
> > can
> >> also be done by reviewers or the maintainer who wants to merge the
> >> PR.
> >>
> >> We attempted to describe some basic rules and expectations here:
> >>
> >>
> >
> >>>
> >>
> https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#step-5-pass-pr-review
> >> but undoubtedly there will be questions and issues that we might
> >> want
> >>> to
> >> solve - so feel free to discuss it here or raise question/issues in
> >> #development channel in slack (I am also happy to be pinged directly
> > about
> >> it and help to resolve any issues/gather feedback).
> >>
> >> J.
> >>
> >
> 
> >>>
> >>
>
>
>
>


Re: [VOTE] Accept AIP-60 (Standard URI representation for Airflow Datasets)

2024-01-22 Thread Oliveira, Niko
+1 (binding)


From: Wei Lee 
Sent: Sunday, January 21, 2024 5:24:54 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [VOTE] Accept AIP-60 (Standard URI 
representation for Airflow Datasets)

+1 (non-binding)

Best,
Wei

> On Jan 22, 2024, at 8:16 AM, Jarek Potiuk  wrote:
>
> +1 (binding).
>
> On Fri, Jan 19, 2024 at 12:38 AM Igor Kholopov 
> wrote:
>
>> +1 (non-binding)
>>
>> On Fri, Jan 19, 2024 at 12:03 AM Maciej Obuchowski
>> wrote:
>>
>>> +1 (non-binding)
>>>
>>> On Thu, Jan 18, 2024, 22:51 Scheffler Jens (XC-AS/EAE-ADA-T)
>>>  wrote:
>>>
 +1 binding as discussed - looking forward for this and THANKS!

 Sent from Outlook for iOS
 
 From: Tzu-ping Chung 
 Sent: Thursday, January 18, 2024 9:07:42 AM
 To: dev@airflow.apache.org 
 Subject: [VOTE] Accept AIP-60 (Standard URI representation for Airflow
 Datasets)

 AIP page:
 https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-60+Standard+URI+representation+for+Airflow+Datasets

 Discussion thread:
 https://lists.apache.org/thread/rf6c80ljjkml0l15h2jys7k713q3os1d

 Reaction to the proposal seems to be mostly positive, with most comments
 around what documentation should be added and the exact criteria by which
 the AIP should be considered “done”. I believe I have addressed most of
 them; most notably, additional sentences have been added to the What
 defines this AIP as "done"? section to require the best practice to be
 demonstrated by example DAGs and tutorials in the documentation.

 One comment I left unaddressed is about auto-generating documentation from
 providers. This is mostly because I’m not quite sure how it can be
 practical. We can generate a list of supported protocols (s3, gcs, file,
 etc.), but that is not particularly useful to users without the actual
 format the URI would use. In the current implementation, each URI handler
 is a simple Python function, and it is not viable to extract logic from it
 unless we adopt some kind of rule-based parser (like regex, and even that
 is too complex to automatically generate documentation from). I am open to
 suggestions on this, so feel free to give a -1 with an idea on this.
 Otherwise I would move the proposal forward without auto documentation
 generation.

 This vote will be kept open for more than 72 hours even if three +1s are
 reached, to gather potential ideas on the documentation thing mentioned
 above. I intend to start implementation (including the example DAGs) in the
 meantime.

 TP

>>>
>>
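
For readers following the auto-documentation point in TP's message above, here
is a minimal sketch in Python of why a list of supported schemes is easy to
generate while the expected URI format is not. All names in it (URI_HANDLERS,
normalize_s3, normalize_file, supported_schemes, normalize) are hypothetical
and are not taken from the AIP or from Airflow's code; the validation rules are
invented for illustration only.

```python
from urllib.parse import SplitResult, urlsplit


# Hypothetical handlers: each one is a plain function that validates a
# parsed URI. The format rules live in ordinary code, not declarative data.
def normalize_s3(parts: SplitResult) -> SplitResult:
    if not parts.netloc:
        raise ValueError("s3 URI must include a bucket name")
    return parts


def normalize_file(parts: SplitResult) -> SplitResult:
    if not parts.path.startswith("/"):
        raise ValueError("file URI must use an absolute path")
    return parts


# Hypothetical registry mapping scheme -> handler function.
URI_HANDLERS = {"s3": normalize_s3, "file": normalize_file}


def supported_schemes() -> list[str]:
    # This part is trivial to auto-generate: it is just the registry keys.
    return sorted(URI_HANDLERS)


def normalize(uri: str) -> str:
    parts = urlsplit(uri)
    handler = URI_HANDLERS.get(parts.scheme)
    # Unknown schemes pass through unchanged in this sketch.
    return (handler(parts) if handler else parts).geturl()
```

The scheme list falls out of the registry, but the rule "an s3 URI needs a
bucket" is only visible by reading the body of normalize_s3, which is the
difficulty the message describes.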



Re: AIP-61 - Hybrid Executors

2024-01-18 Thread Oliveira, Niko
Hey Ryan,

Thanks for the reply! I'll make that note more clear in the AIP. It's a nuanced 
point, and the wording is a bit vague at best right now or borderline 
misleading ><

What I'm trying to say here is that if you don't configure a specific executor 
for a task, it will run on the default/environment level executor and for each 
attempt that will be the case. But I was just pointing out that it's also true 
that the default executor can be reconfigured to a different executor at any 
time (with a scheduler restart of course) so that env/default executor can be 
_any_ executor the user has currently configured. And if that's done in the 
middle of a DAG run, or between retries, etc the default level could be 
different than it was before. And note that this is how Airflow behaves today 
already.
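
To illustrate the behaviour described above, a minimal sketch in Python. The
Task class, the resolve_executor helper, and the executor names passed as
strings are illustrative assumptions, not the API proposed in the AIP.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    task_id: str
    # None means "no override": run on the environment-level default.
    executor: Optional[str] = None


def resolve_executor(task: Task, default_executor: str) -> str:
    """Pick the executor for a single attempt (hypothetical helper)."""
    # An explicit per-task executor always wins; otherwise the default
    # that is configured at the time of this attempt is used.
    return task.executor or default_executor


pinned = Task("pinned", executor="KubernetesExecutor")
unpinned = Task("unpinned")

# First attempt, with the default configured as CeleryExecutor.
first = (
    resolve_executor(pinned, "CeleryExecutor"),
    resolve_executor(unpinned, "CeleryExecutor"),
)

# Retry after the default was reconfigured and the scheduler restarted.
retry = (
    resolve_executor(pinned, "LocalExecutor"),
    resolve_executor(unpinned, "LocalExecutor"),
)
```

In this sketch the pinned task resolves to the same executor on both attempts,
while the unpinned task follows whatever the default happens to be at the time.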

Does that clear things up? Let me know if it doesn't and I'll have a third go 
at it :)

Cheers,
Niko


From: Ryan Hatter 
Sent: Thursday, January 18, 2024 1:27:00 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] AIP-61 - Hybrid Executors

>
> *IMPORTANT NOTE*: task instances that run on the default/environment
> executor (i.e. with no specific override provided) will not persist the
> executor in the same way so that they can be re-run/retried on any executor.


Does this mean that any task that doesn't have the `executor` parameter
specified will run on the default executor for its first attempt, but any
executor on subsequent attempts? If so, how will users specify the default
executor? Also, I think this will provide some debugging challenges when a
task runs on one executor on one attempt, but a different executor on a
different attempt. It might be nice if users could set a
`use_default_executor_on_retry` kind of parameter.

On Mon, Jan 15, 2024 at 1:05 PM Oliveira, Niko 
wrote:

> Hey folks!
>
> I'd like to announce a new proposal for Airflow. I've teased this before
> in my talk on executors at this year's summit and it was also mentioned in
> the townhall last week.
>
> It's a proposal to allow using multiple executors concurrently within a
> single Airflow environment.
>
>
> Let me know what you think!
>
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-61+Hybrid+Execution
>
> Cheers,
> Niko
>


Re: [ANNOUNCE] New PMC member: Andrey Anshin (taragolis)

2024-01-15 Thread Oliveira, Niko
Congrats Andrey!


From: Pankaj Singh 
Sent: Monday, January 15, 2024 11:11:01 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [ANNOUNCE] New PMC member: Andrey 
Anshin (taragolis)

Congrats Andrey 

On Tue, Jan 16, 2024 at 12:30 AM Aritra Basu 
wrote:

> Congrats Andrey!!
>
> --
> Regards,
> Aritra Basu
>
> On Tue, Jan 16, 2024, 12:24 AM Vincent Beck  wrote:
>
> > Congrats Andrey!
> >
> > On 2024/01/15 18:46:32 ambika garg wrote:
> > > Congrats Andrey!!
> > >
> > > On Mon, Jan 15, 2024 at 1:44 PM Amogh Desai 
> > > wrote:
> > >
> > > > Congrats Andrey!!
> > > >
> > > > On Tue, 16 Jan 2024 at 12:08 AM, Hussein Awala 
> > wrote:
> > > >
> > > > > Congratulations Andrey, very well deserved!
> > > > >
> > > > > On Mon 15 Jan 2024 at 19:35, Jarek Potiuk 
> wrote:
> > > > >
> > > > > > > I have the pleasure to announce that The Project Management Committee
> > > > > > > (PMC) for Apache Airflow has invited Andrey Anshin (taragolis) to
> > > > > > > become an Apache Airflow PMC Member and we are pleased to announce
> > > > > > > that he has kindly accepted it.
> > > > > > >
> > > > > > > Being a PMC member enables assistance with the management and to guide
> > > > > > > the direction of the project.
> > > > > > >
> > > > > > > Congratulations Andrey, I'd say it's been very well deserved!
> > > > > >
> > > > > > Regards,
> > > > > > Jarek on behalf of the Apache Airflow PMC
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> >
>


Re: [DISCUSSION] Enabling `pre-commit.ci` application for Airflow

2024-01-04 Thread Oliveira, Niko
Interesting how 50/50 this one has turned out to be!

I'm personally in favour (+1). The less I have to worry about accidental typos, 
indentation, quoting, etc the better, I can focus on important changes. It will 
also unblock many PRs from contributors that are otherwise mergeable except for 
being stuck on very simple static check failures, which as a maintainer sounds 
very nice (it will solve having to post the regular comment of "please run and 
fix static checks").

And ultimately if the bot does something silly (just as a human can and often 
does) we can catch it in the PR review.


Cheers,
Niko


From: Wei Lee 
Sent: Tuesday, January 2, 2024 5:58:18 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSSION] Enabling 
`pre-commit.ci` application for Airflow

Same as Amogh. Even though I would like to fix that myself, it would make it 
much easier for those who aren’t familiar with these tools to still 
contribute. But we might need to document this behavior somewhere (GitHub PR 
issue might make more sense). Otherwise, the contributor might be surprised by 
the new commit.

Best,
Wei

> On Jan 3, 2024, at 12:21 AM, Vincent Beck  wrote:
>
> I like the concept! +1
>
> On 2023/12/30 11:16:35 Amogh Desai wrote:
>> I am aligning here with Pierre, but I am not against the idea of enabling
>> the pre commit ci application.
>>
>> I’d rather have myself fix the issue as it sometimes also lets me have
>> second,third or multiple passes at my code which is up for review. This is
>> a personal choice where I feel that we are trying to fix a problem that is
>> not too problematic.
>>
>> Again, only a personal choice but not against it. If it makes lives of the
>> stakeholders involved easier, I am all up for it!
>>
>> Thanks & Regards,
>> Amogh Desai
>>
>> On Sat, 30 Dec 2023 at 2:35 PM, Pierre Jeambrun 
>> wrote:
>>
>>> I like the idea, but in practice auto fixable static checks are very
>>> obvious to fix and don’t require much work.
>>>
>>> On the other hand most static check failures are ‘real issues’ and not auto
>>> fixable, for instance mypy, spelling, sphinx, db session usage etc. (And
>>> ruff fix is a little aggressive IMO regarding linting).
>>>
>>> So I would say that in practice it solves a painless problem (formatting,
>>> import sorting and other obvious things) and can’t do much about other
>>> issues.
>>>
>>> This is why I am not sure it is worth the confusion for users. (But I am
>>> not against it)
>>>
>>> On Sat 30 Dec 2023 at 09:19, Scheffler Jens (XC-DX/PJ-PACE-E03)
>>>  wrote:
>>>
 I‘d also like to have auto-fixing included in CI. It is a classic pitfall
 and all that can be automated does not need to be a manual burden.
 Though I am not sure whether the plugin is able to use all the custom
 stuff as we also depend during execution on the CI image and docker.
 Besides security things this would be something that needs testing if it
 works.

 TLDR: +1 opinion

 Sent from Outlook for iOS
 
 From: Pankaj Koti 
 Sent: Saturday, December 30, 2023 7:50:10 AM
 To: dev@airflow.apache.org 
 Subject: Re: [DISCUSSION] Enabling `pre-commit.ci` application for Airflow

 I very much like the concept. We have been using it actively for Astronomer
 code repositories for 1+ year already and it has helped us greatly (Thanks
 to Felix Uellendall for introducing this back then)

 On Sat, 30 Dec 2023, 12:10 Jarek Potiuk,  wrote:

> FYI - Just now INFRA rejected the request on the basis of "code write"
> access permissions the app needs.
>
> I'd still love to get feedback though on the concept - I am not giving up
> that easily. We might still get it approved easily. We likely have some
> ways we can get "auto-fixing" working for us.
>
> 1) I believe Github Applications now can use a bit different mechanism to
> ask for permissions and it can have "required" and "optional" permissions
> and I believe "Pull request write" should be enough (and I might attempt to
> convince the maintainers of it to adapt it to our needs).
> 2) Also, there is a "Pre-commit Lite Github Action" that we **might** be
> able to use to achieve a similar effect (with some added complexity to our
> `Pull Request Target` workflow).
>
> So I would still love to hear from others :)
>

Re: [Discuss] New Airflow Community Provider: Teradata

2023-12-24 Thread Oliveira, Niko
This is fantastic! I love to see the testing and dashboards being invested in 
from the beginning, never gets old :)

It shows that you really took the time to read the past discussions and 
documentation we have. I won't be able to look at the code in detail until the 
new year, but so far this is a great submission and a +1 from me!

Cheers,
Niko


From: K Mallam, Sunil 
Sent: Sunday, December 24, 2023 7:36:02 AM
To: dev@airflow.apache.org
Cc: Tworkiewicz, Adam; Mishra, Harishankar; Chinthanippu, Satish
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [Discuss] New Airflow Community 
Provider: Teradata


Thank you Jarek and Elad, we appreciate you recognizing the efforts we put in.


Sunil K Mallam
Staff Product Manager – 3rd Party Developer Tools
17095 Via Del Campo Ct.
San Diego, CA 92127
teradata.com



From: Elad Kalif 
Date: Sunday, 24 December 2023 at 12:30 PM
To: dev@airflow.apache.org 
Cc: Tworkiewicz, Adam , Mishra, Harishankar 
, Chinthanippu, Satish 

Subject: [EXTERNAL] Re: [Discuss] New Airflow Community Provider: Teradata


Hi Sunil,

Proposal looks very solid and I do see the value in having a Teradata
provider.
I agree with Jarek and I am +1 for it.

On Fri, Dec 22, 2023 at 9:27 PM K Mallam, Sunil
 wrote:

> Hi,
>
>
>
> I’m Sunil Mallam, a Staff Product Manager with Teradata, owning the
> Third-Party Connectors and Integrations.
>
>
>
> Here’s our proposal to be a Community Provider, as we see Airflow being a
> key integration for Teradata’s cloud customers in terms of *orchestrating
> data pipelines and ML workflows *-
>
>
>
> *Why Teradata wants to be a community provider?*
>
>- Teradata has over 700 customers (most of them are from Fortune 1000
>list) who run enterprise workloads.
>- Being on *Airflow’s documentation index provides* *better visibility
>for our users* and they can easily find Teradata’s provider package,
>without Teradata having to put additional effort into promoting it.
>- Also, our customers have confidence in the quality and stability of
>the provider package if it’s managed and maintained by Teradata, while
>being validated/approved by Airflow.
>
>
>
> *How is it a positive for the Airflow project/community and why should it
> be accepted?*
>
>- We have a dedicated team to manage and maintain the provider package.
>- Teradata will take responsibility for -
>   - end-to-end testing efforts, running system tests periodically in
>   their environment, and making the status available to Airflow.
>   - being available and responsive to any communication from the
>   community members.
>- To support our dedication toward contributions, Teradata is a verified
>adapter with dbt and we’re diligent in matching up to its releases (major
>and minor).
>
>
>
> *What is developed as part of the initial implementation (phase 1) and
> what’s next (phase 2)?*
>
> *Phase 1 *
>
>1. Building a Hook, Base Operator, and Transfer Operator (Teradata to
>Teradata).
>2. Creating DAGs for a basic use case (data movement).
>3. Creating Tests and match Test Coverage.
>4. Create supporting documentation for users.
>5. Create a dashboard with system tests status history.
>
>
>
> *Phase 2*
>
>1. Building a Transfer Operator (Cloud to Teradata and Teradata to
>Cloud).
>2. Add SSL Support
>3. Enhancements to match Airflow releases.
>4. Enhancements to support Teradata VantageCloud production use
>cases.
>5. Support specific production use cases for Teradata 

Re: [ANNOUNCE] New committer: Utkarsh Sharma

2023-12-04 Thread Oliveira, Niko
Congrats! Very well deserved!


From: Pankaj Koti 
Sent: Monday, December 4, 2023 11:28:41 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [ANNOUNCE] New committer: Utkarsh 
Sharma

Great news, many congratulations Utkarsh ✌

On Tue, 5 Dec 2023, 00:53 Pierre Jeambrun,  wrote:

> Congratulations!
>
> Le lun. 4 déc. 2023 à 20:18, Andrey Anshin  a
> écrit :
>
> > Congrats! 
> >
> >
> >
> >
> > On Mon, 4 Dec 2023 at 22:45, Amogh Desai 
> wrote:
> >
> > > This is fantastic! Congratulations Utkarsh.
> > >
> > > Well deserved!
> > >
> > > Thanks & Best Regards,
> > > Amogh Desai
> > >
> > >
> > > On Mon, 4 Dec 2023 at 10:56 PM, Pankaj Singh  >
> > > wrote:
> > >
> > > > Congratulations Utkarsh 
> > > >
> > > > On Mon, Dec 4, 2023 at 10:45 PM Aritra Basu <
> aritrabasu1...@gmail.com>
> > > > wrote:
> > > >
> > > > > Amazing Utkarsh, congratulations!
> > > > >
> > > > > --
> > > > > Regards,
> > > > > Aritra Basu
> > > > >
> > > > > On Mon, Dec 4, 2023, 10:27 PM Phani Kumar <
> phani.ku...@astronomer.io
> > > > > .invalid>
> > > > > wrote:
> > > > >
> > > > > > Congratulations Utkarsh ! Well deserved
> > > > > >
> > > > > > On Mon, Dec 4, 2023 at 10:13 PM Jarek Potiuk 
> > > wrote:
> > > > > >
> > > > > > > Hello everyone,
> > > > > > >
> > > > > > > [filling-in for Kaxil who is less available nowadays and travelling
> > > > > > > without much access to computer]
> > > > > > >
> > > > > > > The Project Management Committee (PMC) for Apache Airflow
> > > > > > > has invited Utkarsh Sharma to become a committer and we are pleased
> > > > > > > to announce that they have accepted.
> > > > > > >
> > > > > > > Utkarsh had been contributing for quite some time and played
> > > > > > > an instrumental role in making Airflow ready for the LLM revolution
> > > > > > > and recently embarked on the difficult route of improving our
> > > > > > > documentation contributing experience (which is quite a
> > > > > > > challenge to be honest and a brave commitment :) ) and
> > > > > > > I am looking forward to what the future holds with
> > > > > > > Utkarsh becoming a committer.
> > > > > > >
> > > > > > > Congratulations Utkarsh, and welcome onboard! Certainly well deserved.
> > > > > > >
> > > > > > > Being a committer enables easier contribution to the
> > > > > > > project since there is no need to go via the patch
> > > > > > > submission process. This should enable better productivity.
> > > > > > >
> > > > > > > J.
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] Capturing Architectural decisions (ADRS?)

2023-12-04 Thread Oliveira, Niko
I love this idea!

Another option, that I don't think we as a community are very good at, is 
putting the context of the change in the git commit message itself. Those 
messages are already tightly tied to the git history and the code itself 
via blame, without needing to introduce a new concept for this purpose. Those 
commit messages can be viewed with many existing tools that can browse Git 
blame/log. Some projects like Linux or Git itself have very large commit 
messages with all the context required; a random example I pulled from the 
first page of most recent commits: 
https://github.com/torvalds/linux/commit/d67f39d2b81b6a8259944d2400c1ff4fe283ff72

You can see the git commit message is MUCH longer than the code change itself, 
even! So if you're curious why that code is the way it is, you can just git 
blame it and have all the context there.

The ADR having markdown is nice, it allows you a bit more formatting, but then 
it also requires a couple more steps to view that formatting.

Cheers,
Niko


From: Vincent Beck 
Sent: Monday, December 4, 2023 11:22:54 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Capturing Architectural 
decisions (ADRS?)

I love this idea. I definitely think it can greatly improve knowledge sharing 
across Airflow. Given the history and the number of components in Airflow, it 
is hard to keep up with everything, so having these ADRs would help a lot I 
think!

On 2023/12/03 23:57:11 Jarek Potiuk wrote:
> Hey everyone,
>
> I think we had a bit of clash in
> https://github.com/apache/airflow/pull/32319 where both "ideas"
> (serialization and common.sql) had not been sufficiently
> discussed/explained and I hope we can address it by adding (a bit) more
> "whys" to our (developer) documentation.
>
> I think a number of our past decisions and reasoning behind them are often
> staying in the heads of the people who were discussing them, and even if it
> is captured in past discussions, PRs, it's difficult to do "archeology" on
> them and re-process them and understand what we wanted to achieve and why.
> Some of those are big enough to have impact on future PRs etc. while not
> big enough to get to Airflow Improvement Proposals and I think we miss
> a bit of persistent "decision records" for them.
>
> Two cases in question: Serialization and Common.sql API - both of which
> have not been understood well by people involved in one, but not the other
> in the past.
>
> With the "common.sql" PR (https://github.com/apache/airflow/pull/36015) -
> my proposal is to add it in the form of ADRs ("Architecture Decision
> Records") - which is a very simple and lightweight way of storing the
> decisions we made - and evolving them.
>
> ADRs are pretty popular and adopted in mature organisations/projects and
> I've used them in `breeze`
> https://github.com/apache/airflow/tree/main/dev/breeze/doc/adr and I think
> they are perfect for capturing context and decisions, and putting down the
> consequences of some decisions.
>
> They are usually kept close to the code the decision is about, they are
> usually short and describe a single aspect of an architectural decision, and
> they are meant to be read at any point in the future, so that people who were
> not involved in those decisions can easily recover why the decisions were
> made and what their consequences are.
>
> I am not saying - of course - we should do it for all or even most changes
> - I am talking about decisions that have potential impact on others - in
> the future. I.e. when we say "this is how our approach should look in the
> future" for "general" behaviour.
>
> Both - serialization and common.sql are good examples of such decisions -
> that I believe deserve to be captured "why" we are doing them and what we
> wanted to achieve.
>
> WDYT?
>
> J.
>




Re: [PROPOSE] Airflow Monthly Town-Hall

2023-11-29 Thread Oliveira, Niko
Love this idea!



Jumping on this thread to be able to receive the agenda Briana mentioned (but I 
think there's no harm in just including it here for anyone to read).

Cheers,
Niko



From: Briana Okyere 
Sent: Wednesday, November 29, 2023 9:30:21 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [PROPOSE] Airflow Monthly Town-Hall

Thank you all! Love all of these suggestions. I took the liberty of putting
together a rough agenda, so I will share with the folks on this thread
separately and we can go from there!

On Wed, Nov 29, 2023 at 12:15 AM Aritra Basu 
wrote:

> This seems like a great idea and a good way to get info out to the
> community. I'd love to be involved in any way I can.
>
> --
> Regards,
> Aritra Basu
>
> On Wed, Nov 29, 2023, 11:49 AM Akash Sharma <2akash111...@gmail.com>
> wrote:
>
> > I would like to be involved
> >
> > Best regards,
> > Akash
> >
> > On Wed, 29 Nov 2023, 02:11 Briana Okyere,
> >  wrote:
> >
> > > Hey All,
> > >
> > > I've been speaking with Kaxil and Jarek about this, and would like to
> > > propose it to you all.
> > >
> > > It seems those involved in contributing to Airflow regularly are very well
> > > synced on the product. However, I think we can do a better job of involving
> > > the community at large in the amazing work being done daily on this
> > > product.
> > >
> > > So, I would like to propose a monthly town hall, where community members
> > > can join together virtually each month to receive updates, give feedback,
> > > and ideally, see where they can lend a hand.
> > >
> > > If anyone is interested in being involved, please let me know and we can
> > > discuss further. Or if you simply have thoughts, I'd love to hear them!
> > >
> > > --
> > > Briana Okyere
> > > Community Manager
> > > Email: briana.oky...@astronomer.io
> > > Mobile: +1 415.713.9943
> > > Time zone: US Pacific UTC
> > >
> > > 
> > >
> >
>


Re: [DISCUSS] Suspend/Remove Apache Scoop provider

2023-11-23 Thread Oliveira, Niko
+1


From: Pierre Jeambrun 
Sent: Thursday, November 23, 2023 10:28:52 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Suspend/Remove Apache 
Scoop provider

+1

Le jeu. 23 nov. 2023 à 17:52, Hussein Awala  a écrit :

> +1
>
> On Thu, Nov 23, 2023 at 11:52 AM Andrey Anshin 
> wrote:
>
> > Greetings everyone!
> >
> > Since we began to actively use the mechanism to suspend/remove providers, I
> > want to start a discussion about suspending and potentially removing the
> > Apache Sqoop [1] provider.
> >
> > Apache Sqoop moved into the Attic in July 2021 [2] due to inactive
> > development [3] and the Apache Sqoop PMC was terminated.
> > According to the Attic page of Sqoop there are no forks of this project.
> >
> > Latest releases:
> > Sqoop1 (stable): 1.4.7 was released on 2018-01-24
> > Sqoop2: 1.99.7 was released on 2016-08-08
> >
> > Perhaps the fact that the product is no longer being developed might not be
> > a reason to exclude it from the Airflow providers.
> > In that case it would be nice to have a discussion that we could use in
> > future cases of provider suspension/removal.
> >
> > [1] https://sqoop.apache.org/
> > [2] https://attic.apache.org/projects/sqoop.html
> > [3] https://whimsy.apache.org/board/minutes/Sqoop.html
> >
>


Re: [DISCUSS] Suspend (Remove?) Daskexecutor provider

2023-11-16 Thread Oliveira, Niko
If no one comes forward willing to support the executor long term I'm +1 for 
removal.


From: Vincent Beck 
Sent: Thursday, November 16, 2023 10:59:40 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Suspend (Remove?) 
Daskexecutor provider

+1 for removal

On 2023/11/16 18:54:15 Jarek Potiuk wrote:
> More detailed comparison:
>
> apache-airflow 2.7.* ~ 255.000 downloads/day
> apache-airflow-providers-daskexecutor ~ 900/day
>
> This means that *apache-airflow-providers-daskexecutor* is downloaded in
> less than *0.4 %* of cases, compared to *apache-airflow*
>
> I'd say it's negligible usage.
>
> My personal vote would go for immediate removal.
>
> WDYT?
>
> J.
>
>
>
> On Wed, Nov 15, 2023 at 10:39 PM Jarek Potiuk  wrote:
>
> >
> > On Wed, Nov 15, 2023 at 10:20 PM Elad Kalif  wrote:
> >
> >> Now that the code is in its own provider we can check the download stats
> >> of
> >> the library via Pypi stats.
> >>
> >>
> > Good point but this is only an indication.
> >
> > Currently this is only for Airflow 2.7+ and it is a bit difficult to
> > compare those. The "fixed" numbers are misleading (we still do not know
> > where the jump from 400.000 downloads/day to almost 1 million/day came from
> > in mid-October).
> > The ratio is now 1000/day (daskexecutor) : 800.000/day (airflow) -> ~0.1 %.
> > But those are only downloads - and not whether it's used - likely there
> > are a number of cases where CI pulls "all" extras.
> > J.
> >
>
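
The download ratios quoted in this thread can be re-derived with a few lines of
Python. This is only a worked check of the arithmetic; the helper name
share_percent is made up for illustration, and the inputs are the approximate
daily figures given in the messages above.

```python
def share_percent(provider_per_day: float, airflow_per_day: float) -> float:
    """Provider downloads as a percentage of apache-airflow downloads."""
    return 100.0 * provider_per_day / airflow_per_day


# ~900/day for the daskexecutor provider vs ~255,000/day for Airflow 2.7.*
vs_airflow_27 = share_percent(900, 255_000)

# ~1,000/day for the provider vs ~800,000/day for all Airflow versions
vs_all_airflow = share_percent(1_000, 800_000)

print(f"{vs_airflow_27:.3f} % of Airflow 2.7.* downloads")
print(f"{vs_all_airflow:.3f} % of all Airflow downloads")
```

The first ratio works out to roughly 0.35 % and the second to 0.125 %, so
either way the provider accounts for well under half a percent of downloads.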

-
To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
For additional commands, e-mail: dev-h...@airflow.apache.org
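
For reference, the download ratios quoted in this thread can be recomputed
from the stated figures. The snippet below is only a sketch: the numbers are
copied from the messages above, and the function name `download_share` is made
up for illustration.

```python
def download_share(provider_per_day: float, core_per_day: float) -> float:
    """Return provider downloads as a percentage of core downloads."""
    return 100.0 * provider_per_day / core_per_day


# Figures quoted on 2023-11-16: ~900/day vs ~255,000/day.
later = download_share(900, 255_000)
# Figures quoted on 2023-11-15: ~1,000/day vs ~800,000/day.
earlier = download_share(1_000, 800_000)

print(f"2023-11-16 figures: {later:.2f} %")
print(f"2023-11-15 figures: {earlier:.3f} %")
```

Either way the provider accounts for well under one percent of core
downloads, and, as noted in the thread, downloads are not the same thing as
usage.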



Re: [ANNOUNCE] New committer: Jens Scheffler

2023-11-07 Thread Oliveira, Niko
Congrats Jens!!


From: Briana Okyere 
Sent: Tuesday, November 7, 2023 1:00:35 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [ANNOUNCE] New committer: Jens 
Scheffler

Congratulations Jens!

On Tue, Nov 7, 2023 at 12:59 PM Andrey Anshin 
wrote:

> Congrats Jens!  
>
> 
> Best Wishes
> *Andrey Anshin*
>
>
>
> On Tue, 7 Nov 2023 at 23:33, Utkarsh Sharma
>  wrote:
>
> > Congratulations Jens! :)
> >
> > Thanks,
> > Utkarsh Sharma
> >
> > On Wed, Nov 8, 2023 at 1:01 AM Vincent Beck  wrote:
> > >
> > > Welcome onboard Jens!
> > >
> > > On 2023/11/07 19:24:04 Jarek Potiuk wrote:
> > > > The Project Management Committee (PMC) for Apache Airflow
> > > > has invited Jens Scheffler to become a committer and we are pleased
> > > > to announce that they have accepted.
> > > >
> > > > > Jens has been contributing for a number of months.
> > > > > He has also participated a lot in discussions and decisions
> > > > > on many aspects of Airflow, and helps our users and
> > > > > contributors a lot on Slack, GitHub, and Discussions.
> > > >
> > > > I am looking forward to what the future holds with Jens
> > > > becoming the committer,
> > > >
> > > > Congratulations Jens, and welcome onboard!
> > > >
> > > > Being a committer enables easier contribution to the
> > > > project since there is no need to go via the patch
> > > > submission process. This should enable better productivity.
> > > > A PMC member helps manage and guide the direction of the project.
> > > >
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > > For additional commands, e-mail: dev-h...@airflow.apache.org
> > >
> >
> >
> >
>


Re: [VOTE] Add providers for Pinecone, OpenAI & Cohere to enable first-class LLMOps

2023-10-25 Thread Oliveira, Niko
+1 (binding)

looking forward to having more native LLM capabilities in Airflow!


From: Aritra Basu 
Sent: Wednesday, October 25, 2023 12:10:00 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [VOTE] Add providers for Pinecone, 
OpenAI & Cohere to enable first-class LLMOps

+1 (non binding)

--
Regards,
Aritra Basu

On Wed, Oct 25, 2023, 11:02 PM Ferruzzi, Dennis 
wrote:

> +1 (binding)
>
>
>  - ferruzzi
>
>
> 
> From: Jed Cunningham 
> Sent: Wednesday, October 25, 2023 9:54 AM
> To: dev@airflow.apache.org
> Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [VOTE] Add providers for
> Pinecone, OpenAI & Cohere to enable first-class LLMOps
>
> +1 (binding)
>


Re: The "no_status" state

2023-10-16 Thread Oliveira, Niko
I really like this idea as well! One of _the most common_ questions I get 
from people managing an Airflow env is "Why is my task stuck in state X". 
Anything we can do to make that more discoverable and user friendly, especially 
in the UI instead of (or in addition to) logs, would be fantastic!

Thanks to Jens for having a think and pointing out a lot of the implications, I 
agree a quick AIP might be nice for this one.

Cheers,
Niko


From: Scheffler Jens (XC-DX/ETV5) 
Sent: Thursday, September 28, 2023 10:36:00 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] The "no_status" state

Hi Ryan,

I really like the idea of exposing some more scheduler details. More 
transparency in scheduling, also in the UI, would help users by (1) letting 
them see and understand what is going on and (2) reducing the need to crawl 
through logs and raise support tickets if a status is “strange”. I often see 
this as a problem too. It also sometimes generates a bit of mistrust in the 
scheduler's stability.

From the point of view of scheduler “overhead”, I assume that as long as we are 
not making a “full scan” just to ensure that each and every task is always up 
to date (today the scheduler stops processing after enough tasks have been 
processed in a loop or if scheduling limits are reached), this is OK for me, 
and on the code side it does not seem to be much overhead.
On the other hand, I have a bit of fear that very many frequent updates would 
need to happen on the DB, as another state would need to be written, so more DB 
round trips are needed. This might hit performance for large DAGs or cases 
where DAGs are scheduled. So at the least it would need to filter and write the 
state to the DB only if it changed, to keep the performance impact minimal.
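
A minimal sketch of that "write only if changed" filter, assuming the
scheduler has the previously stored state of each task at hand. The function
and the task names are hypothetical and not part of Airflow; "not_ready" is
the state name proposed later in this message.

```python
from typing import Dict


def states_to_persist(stored: Dict[str, str], computed: Dict[str, str]) -> Dict[str, str]:
    """Return only the task states that differ from what is already stored."""
    return {
        task_id: state
        for task_id, state in computed.items()
        if stored.get(task_id) != state
    }


# Hypothetical snapshot: what the DB holds vs. what one scheduler loop computed.
stored = {"extract": "none", "transform": "none", "load": "none"}
computed = {"extract": "scheduled", "transform": "not_ready", "load": "none"}

# Only two of the three tasks would cause a DB write.
changed = states_to_persist(stored, computed)
print(changed)
```

The point of the filter is that a task whose state is unchanged between
scheduler loops costs no extra DB round trip.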

From the point of view of naming, I still think “no status” is good to indicate 
that the scheduler did not digest anything; maybe the task was never looked at 
because the scheduler is actually stuck or too busy to get there. I would 
propose that, if the scheduler passes over a task and decides that it is not 
ready to schedule, we have an additional state called e.g. “not_ready” in the 
state model between “none” and “scheduled”.

Finally on the other hand, adding another state in the model, I am not sure 
whether this 100% will help in the use case described by you. Still you might 
need to scratch your head a while if taking a look on UI that a DAG is “stuck” 
until you realize all the options you have configured. Exposing a “why is 
stuck” in a user friendly manner might be another level of complexity in this 
case.

As the state model might touch a lot of code and a longer discussion might be 
needed, would we need to raise an AIP for this? There might also be a lot more 
(external, provider??) dependencies to adjust when the state model changes?

Best regards

Jens Scheffler

Deterministik open Loop (XC-DX/ETV5)
Robert Bosch GmbH | Hessbruehlstraße 21 | 70565 Stuttgart-Vaihingen | GERMANY | 
www.bosch.com
Tel. +49 711 811-91508 | Mobil +49 160 90417410 | 
jens.scheff...@de.bosch.com

From: Ryan Hatter 
Sent: Thursday, 28 September 2023 23:59
To: dev@airflow.apache.org
Subject: The "no_status" state

Over the last couple of weeks I've come across a rather tricky problem a few 
times. One DAG run gets "stuck" in the queued state, while subsequent DAG runs 
are stuck running (screenshot below). One of these issues was caused by 
`max_active_runs` being met when a task instance from a previous DAG run was 
cleared, and one of the tasks had `depends_on_past=True`. This left the DAG 
run stuck in queued in perpetuity, until it was realized that the task that 
wasn't getting scheduled needed the failed task in the preceding DAG run to be 
re-run, which in turn kept the later DAG runs stuck in running. All of this 
caused quite a bit of confusion and stress.

Given that Airflow is pretty burnt out on task instance states and colors, I 
propose replacing "no_status" with "dependencies_not_met" and surfacing 
dependencies in the grid view instead of forcing users to already know where to 
look (i.e. "more details" task instance details). Now that I typed it out, I'm 
not sure there should be a 

Re: [DISCUSS] Executors docs should be published in Airflow core or providers?

2023-09-11 Thread Oliveira, Niko
+1 to this!

I also have a docs section half written on the executor interface and how to 
extend it. But I've been very busy with a few other items that are completing 
soon.

Cheers,
Niko


From: Pankaj Koti 
Sent: Friday, September 8, 2023 12:03:35 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Executors docs should be 
published in Airflow core or providers?

+1 to the proposal.

I think core Airflow docs can contain details about the default executor that
gets shipped with a standalone Airflow installation, plus a short note about
the possibility of using other (provider) executors in production, pointing
readers to the detailed docs in the corresponding provider.

Regards,



Pankaj Koti

*Senior Software Engineer, *OSS Engineering Team.
Location: Pune, India

Timezone: Indian Standard Time (IST)

Email: pankaj.k...@astronomer.io

Mobile: +91 9730079985


On Fri, Sep 8, 2023 at 11:28 PM Hussein Awala  wrote:

> Since we moved the executors to the providers packages and made the
> executor interface pluggable and extensible, we should move the docs to
> their corresponding providers. However, we need to keep a doc in Airflow
> core that explains how to use/configure a provider executor (as we have for
> the secret managers and the task log handlers) and maybe how to create a
> new custom one.
>
> On Fri, Sep 8, 2023 at 6:15 PM Elad Kalif  wrote:
>
> > Hello everyone,
> >
> > This thread is opened due to the open issue "Migrate Celery/Dask/Kubernetes
> > Executor docs to providers":
> > https://github.com/apache/airflow/issues/33916
> >
> > *Background:*
> > We had a discussion about extracting Celery, Kubernetes, Dask executors
> > from core to providers (discussion thread, vote thread).
> >
> > One of the things we voted on was:
> >
> > Also, resulting from the discussion we will keep documentation for
> > > available executors in Airflow (so they will still be considered as THE
> > > executors available and will be discoverable in the same way as today).
> >
> >
> > *The problem:*
> > Airflow Core and Providers do not share the same release cycle nor cadence.
> > This means that if we add new capabilities to executors or fix an issue
> > which requires both a code and a doc update - the code will be delivered
> > before or after the documentation. Both cases are not good.
> >
> >
> > *My proposal:*
> > Now that the Celery, Kubernetes, and Dask executors are in providers, the
> > core-concepts section
> > <https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/index.html>
> > should contain only HIGH level information about the executors. It should
> > not contain information about executor internals or how to address common
> > problems. This information should be in the provider docs. The high level
> > info should be short and on a level that is relevant for all executors,
> > which means it's not likely to have many changes over time. The
> > core-concepts section should link to the provider docs for deep-dive
> > reading. This is very similar to what we do with Notifiers
> > <https://airflow.apache.org/docs/apache-airflow-providers/core-extensions/notifications.html>:
> > core contains high level information and a list of notifiers that are
> > linked to provider docs.
> >
> > WDYT?
> >
> >
>


Re: [DISCUSS] move from semver to a more "rolling" release cycle for core

2023-08-29 Thread Oliveira, Niko

I'd prefer we stick with semver. As discussed already, there is a little 
friction with each approach, and it's who that friction lands on that's 
important. If we moved to a more time-based breaking-change approach, it would 
reduce our frustration but shift it over to our users. Whereas right now we 
as developers take on a bit more frustration so that our users can confidently 
upgrade and manage Airflow. I think this is the right thing, a user-focused 
approach: it's up to us to make Airflow as easy to use and upgrade as possible 
(within limits of course, but I don't think we've crossed that, yet).

Cheers,
Niko



From: Daniel Standish 
Sent: Sunday, August 27, 2023 9:04:51 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [DISCUSS] move from semver to a more "rolling" release 
cycle for core

>
> Since I was called :)


As though you needed to be called to chime in ;)

Yeah and the other thing that your comments made me think of was... how it
could make provider management more challenging.  Because though currently
we have min_airflow_version set in providers and we can use that to control
behavior (and assumptions about what's in core), presently it's just about
future compat and just addition of new features.  But with a change like
this, it would expand that burden to some extent, by
requiring consideration of what's changed and what's removed, in a way that
is not a practical issue presently.

I see no particular reason for removing features if they do not slow us down


Yeah so wholesale removal of features is one thing, like with the subdags
you mentioned.  But the prospect of the infinitely distant 3.0 also has a
more diffuse impact on development. I'm sure many good ideas have emerged
but been ruled out solely based on backcompat.  Sometimes probably on a
narrow backcompat concern where it's maybe like... is anybody really
relying on this aspect of behavior?

Maybe that's simply what we must deal with.  But the thought occurred to
me, maybe there's some other way.

And yeah I shouldn't say it's "not working for us"... that's just me
writing an email 2 minutes before bedtime when an idea popped in my
head. Obviously it's working ok for us, and doing a lot of work *for* us.




On Sun, Aug 27, 2023 at 1:33 AM Jarek Potiuk  wrote:

> Since I was called :).
>
> Yes. I would be very, very careful here. You might think that we use
> "SemVer" as a "cult". Finally it's  just a versioning scheme we adopted,
> right?  But for me -  this is way more. It's all about communication with
> our users, making promises to our users and design decisions that impact
> our security policies.
>
> I think Semver has this nice property that we can promise our users "if you
> are using the public interface of Airflow, you can upgrade without too much
> of a fear that things will break - if they will be broken, this will be
> accidental and will get fixed".  BTW we already have, very nicely defined
>
> https://airflow.apache.org/docs/apache-airflow/stable/public-airflow-interface.html
> so it's pretty clear what we promise to our users. And it also has certain
> "security" properties - but I will get to that.
>
> I would love to hear what others think, but I have 3 important aspects that
> should be considered in this discussion
>
> 1. Promises we make to our users and what it means for us answering
> their issues.
>
> Surely we could make other promises. CalVer promises (We release regularly)
> - but it does not give the user any indication that whatever worked before
> will work in the foreseeable future and will get maintained. It makes
> maintainer life easier, yes. It however makes the user's life harder,
> because they cannot rely on something being available and their
> upgrades might and will be more difficult. And yes - for Snowflake it
> matters a lot, because they actually get paid for supporting old versions
> and they have no choice but to respond to users who claim the "old
> supported version does not work". They cannot (as we can, and often do
> currently) tell the users "upgrade to the latest version - it should be
> easy because of the SemVer promise - if you follow our 'use public interface
> only' rule, of course". We (community/maintainers) can very easily say that and
> since we give no support, no guarantees, no-one pays for solving problems,
> this "upgrade to latest version" is actually a very good answer - many,
> many times.
>
> For maintainers that rarely respond to user questions, yes Semver is harder
> to add new things. But for maintainers who actually respond a lot to users'
> questions, life is suddenly way harder - because they cannot answer
> "upgrade to latest version" - because immediately the user will answer "but
> I cannot - because I am using this and that feature. tell me how to 

Re: [DISCUSS] Preventing users from misusing _PIP_ADDITIONAL_REQUIREMENTS ?

2023-08-29 Thread Oliveira, Niko
I'd vote for a period of time with warnings (either in the logs and/or in the 
Airflow UI), as a deprecation warning of sorts. Followed by removing the 
feature later on, unless we find that the warnings are enough to lower the 
operational load this causes us, but I think that's unlikely.

Cheers,
Niko
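
To make the "warn first, remove later" idea concrete, here is a rough sketch
using Python's standard `warnings` module. It is an illustration only: the
helper name and the message text are invented, and this is not how Airflow or
its container image currently handles `_PIP_ADDITIONAL_REQUIREMENTS`.

```python
import os
import warnings
from typing import Mapping


def warn_if_pip_additional_requirements(environ: Mapping[str, str]) -> bool:
    """Emit a deprecation warning when the variable is set; return True if warned."""
    value = environ.get("_PIP_ADDITIONAL_REQUIREMENTS", "").strip()
    if not value:
        return False
    warnings.warn(
        "_PIP_ADDITIONAL_REQUIREMENTS is set; installing packages at startup is "
        "meant for quick experiments only and may be removed in the future.",
        DeprecationWarning,
        stacklevel=2,
    )
    return True


if __name__ == "__main__":
    # Check the real process environment when run as a script.
    warn_if_pip_additional_requirements(os.environ)
```

The same check could feed a UI banner instead of (or in addition to) the log
output; which channel is used is a separate decision from the warning logic.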


From: Jed Cunningham 
Sent: Tuesday, August 29, 2023 10:05:01 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [DISCUSS] Preventing users from misusing 
_PIP_ADDITIONAL_REQUIREMENTS ?

I also don't like the 10 minute thing. I'd rather we remove it, or display
a message like we do for the sequential executor (we can only do so much, this
is as visible as we can make it really), I think in that order?


Re: [VOTE] Drop MsSQL as supported backend

2023-08-28 Thread Oliveira, Niko
+1 (binding)


From: Jed Cunningham 
Sent: Monday, August 28, 2023 9:32:43 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [VOTE] Drop MsSQL as supported backend

+1 binding


Re: [VOTE] Airflow Providers prepared on August 11, 2023

2023-08-11 Thread Oliveira, Niko
+1 (non-binding)

Checked my change as well as ran an AWS System test suite on the release 
candidate, all green: 
https://aws-mwaa.github.io/open-source/system-tests/version/b5a4d36383c4143f46e168b8b7a4ba2dc7c54076_8.5.1rc1.html


Cheers,
Niko


From: Vincent Beck 
Sent: Friday, August 11, 2023 9:12:59 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [VOTE] Airflow Providers prepared on August 11, 2023

+1 non-binding. I tested successfully the Amazon provider package using my 
testing DAGs. I also tested my changes and it looks good.

On 2023/08/11 12:29:48 Utkarsh Sharma wrote:
> +1 non-binding
>
> Thanks,
> Utkarsh
>
>
> On Fri, Aug 11, 2023 at 5:48 PM Pankaj Koti
>  wrote:
> >
> > +1 (non-binding)
> >
> > Concur with Wei & Rahul.
> >
> > Regards,
> >
> >
> >
> > Pankaj Koti
> >
> > *Senior Software Engineer, *OSS Engineering Team.
> > Location: Pune, India
> >
> > Timezone: Indian Standard Time (IST)
> >
> > Email: pankaj.k...@astronomer.io
> >
> > Mobile: +91 9730079985
> >
> >
> > On Fri, Aug 11, 2023 at 5:45 PM Jarek Potiuk  wrote:
> >
> > > +1 (binding). Checked licences, checksums, signatures, sources -> all
> > > looks good. Looked at my change for this release: the updated appflow
> > > dependency is there.
> > >
> > > On Fri, Aug 11, 2023 at 1:49 PM rahul Vats  wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > Tested our example DAGS successfully for below providers
> > > >
> > > > apache-airflow-providers-amazon==8.5.1rc1
> > > > apache-airflow-providers-cncf-kubernetes==7.4.2rc1
> > > > apache-airflow-providers-databricks==4.3.3rc1
> > > > apache-airflow-providers-snowflake==4.4.2rc1
> > > > apache-airflow-providers-microsoft-azure==6.2.4rc1
> > > >
> > > > Regards,
> > > > Rahul Vats
> > > > 9953794332
> > > >
> > > >
> > > > On Fri, 11 Aug 2023 at 16:39, Wei Lee  wrote:
> > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > Tested with our example DAGs with the following providers without
> > > > > encountering issues
> > > > >
> > > > > apache-airflow-providers-amazon==8.5.1rc1
> > > > > apache-airflow-providers-cncf-kubernetes==7.4.2rc1
> > > > > apache-airflow-providers-databricks==4.3.3rc1
> > > > > apache-airflow-providers-snowflake==4.4.2rc1
> > > > > apache-airflow-providers-microsoft-azure==6.2.4rc1
> > > > >
> > > > >
> > > > > Best,
> > > > > Wei
> > > > >
> > > > >
> > > > > > On Aug 11, 2023, at 12:51 PM, Elad Kalif  wrote:
> > > > > >
> > > > > > Hey all,
> > > > > >
> > > > > >
> > > > > > I have just cut the new wave Airflow Providers packages. This email is
> > > > > > calling a vote on the release,
> > > > > >
> > > > > > which will last for 72 hours - which means that it will end on August 14,
> > > > > > 2023 04:47 AM UTC and until 3 binding +1 votes have been received.
> > > > > >
> > > > > >
> > > > > >
> > > > > > Consider this my (binding) +1.
> > > > > >
> > > > > >
> > > > > >
> > > > > > Airflow Providers are available at:
> > > > > >
> > > > > > https://dist.apache.org/repos/dist/dev/airflow/providers/
> > > > > >
> > > > > >
> > > > > > *apache-airflow-providers--*.tar.gz* are the binary
> > > > > >
> > > > > > Python "sdist" release - they are also official "sources" for the provider
> > > > > > packages.
> > > > > >
> > > > > >
> > > > > > *apache_airflow_providers_-*.whl are the binary
> > > > > >
> > > > > > Python "wheel" release.
> > > > > >
> > > > > >
> > > > > > The test procedure for PMC members is described in
> > > > > >
> > > > > > https://github.com/apache/airflow/blob/main/dev/README_RELEASE_PROVIDER_PACKAGES.md#verify-the-release-candidate-by-pmc-members
> > > > > >
> > > > > >
> > > > > > The test procedure for Contributors who would like to test this RC is
> > > > > > described in:
> > > > > >
> > > > > > https://github.com/apache/airflow/blob/main/dev/README_RELEASE_PROVIDER_PACKAGES.md#verify-the-release-candidate-by-contributors
> > > > > >
> > > > > >
> > > > > >
> > > > > > Public keys are available at:
> > > > > >
> > > > > > https://dist.apache.org/repos/dist/release/airflow/KEYS
> > > > > >
> > > > > >
> > > > > > Please vote accordingly:
> > > > > >
> > > > > >
> > > > > > [ ] +1 approve
> > > > > >
> > > > > > [ ] +0 no opinion
> > > > > >
> > > > > > [ ] -1 disapprove with the reason
> > > > > >
> > > > > >
> > > > > >
> > > > > > Only votes from PMC members are binding, but members of the community are
> > > > > > encouraged to test the release and vote with "(non-binding)".
> > > > > >
> > > > > >
> > > > > > Please note that the version number excludes the 'rcX' string.
> > > > > >
> > > > > > This will allow us to rename the artifact 

Re: [VOTE] Airflow Providers prepared on July 29, 2023

2023-08-01 Thread Oliveira, Niko
Tested the RC1 against the Amazon system tests and the results can be viewed 
here: 
https://aws-mwaa.github.io/open-source/system-tests/version/ddcd30e7c7f5daeab5f74fb3224a4d5e33cec95d_8.4.0rc1.html

I would still like to do some more testing around executors.

P.S.
I think the Celery issue Vikram is mentioning is this one: 
https://github.com/apache/airflow/issues/32973


From: Elad Kalif 
Sent: Monday, July 31, 2023 11:21:04 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [VOTE] Airflow Providers prepared on July 29, 2023

Hi Vikram
good to hear from you.
Can you be more specific? I didn't see any thread/report about Celery issue
since RC1 was cut

On Tue, Aug 1, 2023 at 8:43 AM Vikram Koka 
wrote:

> Not trying to hold up the release, but I thought there was a "race
> condition" bug discovered with the Celery executor as a provider and
> Airflow main.
> Is that resolved now?
>
> Or, did I mistake the origin of that?
>
> On Mon, Jul 31, 2023 at 8:19 AM Beck, Vincent wrote:
>
> > +1 (non-binding). I tested the Amazon provider package against my testing
> > DAGs.
> >
> > On 2023-07-31, 11:03 AM, "Pankaj Singh" wrote:
> >
> >
> > +1 (non binding)
> >
> >
> > On Mon, Jul 31, 2023 at 4:34 PM Ephraim Anierobi wrote:
> >
> >
> > > +1 binding. Checked files, signatures & licenses
> > >
> > > On Mon, 31 Jul 2023 at 08:46, Amogh Desai wrote:
> > >
> > > > +1 (non binding)
> > > > Tested out cncf provider:
> > > > https://pypi.org/project/apache-airflow-providers-cncf-kubernetes/7.4.0rc1
> > > > with few examples DAGs. No regressions.
> > > >
> > > > Thanks & Regards,
> > > > Amogh Desai
> > > >
> > > > On Sun, Jul 30, 2023 at 9:56 PM utkarsh sharma wrote:
> > > >
> > > > > +1 non binding. Tested apprise provider
> > > > > https://pypi.org/project/apache-airflow-providers-apprise/1.0.1rc1/
> > > > >
> > > > > On Sun, 30 Jul 2023 at 9:42 PM, Phani Kumar wrote:
> > > > >
> > > > > > +1 non binding
> > > > > >
> > > > > > On Sun, 30 Jul 2023, 20:54 Hussein Awala wrote:
> > > > > >
> > > > > > > +1 (binding) I tested my change and I checked the signatures, the
> > > > > > > checksums and the source code.
> > > > > > >
> > > > > > > On Sun, Jul 30, 2023 at 4:18 PM rahul Vats wrote:
> > > > > > >
> > > > > > > > +1 (non-binding)
> > > > > > > >
> > > > > > > > I have tested our example DAGS with the below providers and they are
> > > > > > > > working perfectly fine!
> > > > > > > >
> > > > > > > > 1. apache-airflow-providers-amazon==8.4.0rc1
> > > > > > > > 2. apache-airflow-providers-apache-hive==6.1.3rc1
> > > > > > > > 3. apache-airflow-providers-cncf-kubernetes==7.4.0rc1
> > > > > > > > 4. apache-airflow-providers-databricks==4.3.2rc1
> > > > > > > > 5. apache-airflow-providers-google==10.5.0rc1
> > > > > > > > 6. apache-airflow-providers-microsoft-azure==6.2.2rc1
> > > > > > > > 7. apache-airflow-providers-sftp==4.5.0rc1
> > > > > > > > 8. apache-airflow-providers-snowflake==4.4.0rc1
> > > > > > > >
> > > > > > > > Also, followed https://github.com/apache/airflow/pull/32948 and tested
> > > > > > > > Celery Executor with Breeze, this is very helpful. Thank you.
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Rahul Vats
> > > > > > > > 9953794332
> > > > > > > >
> > > > > > > >
> > > > > > > > On Sun, 30 Jul 2023 at 17:28, Jarek Potiuk wrote:
> > > > > > > >
> > > > > > > > > +1 (binding) - tested signatures, checksums, licences, source code for
> > > > > > > > > both sdist and wheel packages. Checked all my changes - they are in.
> > > > > > > > >
> > > > > > > > > I also used Breeze capabilities to build the "main" airflow (upcoming
> > > > > > > > > 2.7.0) and ran it alongside new providers (with Celery Executor) -
> > > > > > > > > looks like the Celery Executor move works flawlessly.
> > > > > > > > >
> > > > > > > > > FYI: 

Re: [ANNOUNCE] New PMC member: Hussein Awala

2023-07-31 Thread Oliveira, Niko
Congrats Hussein!


From: Beck, Vincent 
Sent: Monday, July 31, 2023 7:32:08 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [ANNOUNCE] New PMC member: Hussein Awala

Congratulations Hussein!

On 2023-07-30, 5:41 AM, "utkarsh sharma" wrote:


Congratulations well deserved :)


On Sun, 30 Jul 2023 at 3:08 PM, Pankaj Singh wrote:


> Congrats Hussein 
>
> On Sun, Jul 30, 2023 at 12:14 PM Ankit Chaurasia wrote:
>
> > Congratulations
> >
> > On Sun, 30 Jul 2023 at 11:43, Phani Kumar wrote:
> >
> > > Congrats Hussein
> > >
> > > On Sun, 30 Jul 2023, 10:49 Elad Kalif wrote:
> > >
> > > > Congrats! well deserved!
> > > >
> > > > On Sun, Jul 30, 2023 at 5:36 AM Wei Lee wrote:
> > > >
> > > > > Congratulations!
> > > > >
> > > > > Best,
> > > > > Wei
> > > > >
> > > > > > On Jul 30, 2023, at 1:39 AM, Pankaj Koti wrote:
> > > > > >
> > > > > > Many congratulations, Hussein
> > > > > >
> > > > > > On Sat, 29 Jul 2023, 23:06 Amogh Desai wrote:
> > > > > >
> > > > > >> Awesome news! Good going, Hussein!
> > > > > >>
> > > > > >> Congratulations 
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Amogh Desai
> > > > > >>
> > > > > > >> On Sat, Jul 29, 2023, 23:04 Jarek Potiuk wrote:
> > > > > > >>
> > > > > > >>> Hello Airflow Community,
> > > > > > >>>
> > > > > > >>> I have the pleasure to announce that The Project Management Committee
> > > > > > >>> (PMC) for Apache Airflow has invited Hussein Awala to become Apache
> > > > > > >>> Airflow PMC Member and we are pleased to announce that he has kindly
> > > > > > >>> accepted it.
> > > > > > >>>
> > > > > > >>> Being a PMC member enables assistance with the management and to guide
> > > > > > >>> the direction of the project.
> > > > > >>> the direction of the project.
> > > > > >>>
> > > > > >>> Congratulations Hussein, It's been very well deserved !.
> > > > > >>>
> > > > > >>> Regards,
> > > > > >>> Jarek on behalf of the Apache Airflow PMC
> > > > > >>>
> > > > > >>>
> > > -
> > > > > >>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org 
> > > > > >>> 
> > > > > >>> For additional commands, e-mail: dev-h...@airflow.apache.org 
> > > > > >>> 
> > > > > >>>
> > > > > >>>
> > > > > >>
> > > > >
> > > > >
> > > > >
> -
> > > > > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org 
> > > > > 
> > > > > For additional commands, e-mail: dev-h...@airflow.apache.org 
> > > > > 
> > > > >
> > > > >
> > > >
> > >
> > --
> > *Ankit Chaurasia*
> > HomePage   
> > | LinkedIn
> >  
> >  | +91-9987351649
> >
>




-
To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
For additional commands, e-mail: dev-h...@airflow.apache.org


Re: [VOTE] AIP-57 Refactor SLA Feature

2023-07-24 Thread Oliveira, Niko
Looking forward to this one! I think the new behaviours will be much better 
than we have now.

+1 (binding)




From: Utkarsh Sharma 
Sent: Thursday, July 20, 2023 11:45:57 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][VOTE] AIP-57 Refactor SLA Feature


+1 (non binding)

On Thu, Jul 20, 2023 at 11:57 PM Jarek Potiuk  wrote:
>
> +1 (binding).
>
> On Mon, Jul 17, 2023 at 7:56 PM Damian Shaw 
> wrote:
>
> > +1 (Non-binding)
> >
> > Although I would like to see some version of the proposed "SlaTask"
> > workaround included in Airflow itself so that:
> > 1) Tests can be added to make sure other Airflow changes don't
> > break assumptions about how it would work, even if people mostly implement
> > their own version, they can use this one as a base and know it works
> > 2) People can start migrating away from the current SLA mechanism
> > as soon as possible
> >
> > Damian
> >
> > -Original Message-
> > From: Sung Yun 
> > Sent: Monday, July 17, 2023 1:09 PM
> > To: dev@airflow.apache.org
> > Subject: [VOTE] AIP-57 Refactor SLA Feature
> >
> > Dear Airflow community,
> >
> > I would like to start a vote for "AIP-57 Refactor SLA Feature"
> >
> > You can find the AIP here:
> >
> > https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-57+Refactored+SLA+Feature
> >
> > Discussion threads:
> > https://lists.apache.org/thread/kn1rb9l70jk16j83fycp68l2g70wnhzl
> >
> > https://lists.apache.org/thread/kj33fwd9otbbm52my9j3y39rp547g0n5
> >
> > https://lists.apache.org/thread/l2ho7w18d11vcnon7dxoxghzxhtjw5hf
> >
> >
> > This is my (non-binding) +1, the vote will last until 1pm PST July 24th,
> > and until at least 3 binding votes have been cast.
> >
> >
> > Please vote accordingly:
> >
> > [ ] + 1 approve
> > [ ] + 0 no opinion
> > [ ] - 1 disapprove with the reason
> >
> > Only votes from PMC members and committers are binding, but other members
> > of the community are encouraged to check the AIP and vote with
> > "(non-binding)".



Re: [DISCUSS] Should we pre-install celery/k8s providers?

2023-07-21 Thread Oliveira, Niko
As I've said before I believe not pre-installing celery and k8s executors 
should be the ultimate goal, so I totally agree with this, but we need to do it 
in a way that minimizes impact. It's hard to catch every angle possible with 
these sorts of things (i.e. Daniel's point of folks installing required deps 
outside of extras, I could see this being done pretty commonly).

What do folks think about doing this in stages? I don't see a reason why we 
need to make all the changes at once. We could release the executors in 
providers and have them come pre-installed for some number of releases. This 
would allow the majority of folks to slowly migrate to newer versions of both 
core and providers which would then minimize the blast radius when we 
eventually turn off the pre-installation of those providers.


From: Daniel Standish 
Sent: Friday, July 21, 2023 10:07:44 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][DISCUSS] Should we pre-install celery/k8s providers?


* *provider* extras or provider optional features


Re: [DISCUSS] Contributing "core" options by providers configuration ?

2023-07-21 Thread Oliveira, Niko
I agree with Jed. I don't actually mind allowing contribution of any 
config/section but I think any conflicts discovered should fail very loudly.


From: Jed Cunningham 
Sent: Friday, July 21, 2023 7:45:02 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][DISCUSS] Contributing "core" options by providers 
configuration ?


I don't necessarily mind providers adding options to existing sections, but
I can see the appeal and simplicity of enforcing it at the section level.

 Either way, I think in the case of multiple, it should fail loudly, not
just warn.
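
A minimal sketch of the "fail loudly" behaviour discussed above, written as plain Python rather than Airflow code. The function `merge_provider_config`, the exception `ConfigConflictError`, and the provider and option names in the usage lines are all hypothetical and only illustrate the idea of rejecting duplicate options instead of warning about them.

```python
from typing import Dict, Mapping, Tuple

ConfigSections = Dict[str, Dict[str, str]]


class ConfigConflictError(Exception):
    """Raised when two contributors define the same section/option pair."""


def merge_provider_config(
    core: Mapping[str, Mapping[str, str]],
    contributions: Mapping[str, Mapping[str, Mapping[str, str]]],
) -> ConfigSections:
    """Merge provider-contributed options into core config, failing on any overlap."""
    merged: ConfigSections = {section: dict(options) for section, options in core.items()}
    # Remember who defined each option so the error can name both parties.
    owners: Dict[Tuple[str, str], str] = {
        (section, option): "core" for section, options in core.items() for option in options
    }
    for provider, sections in contributions.items():
        for section, options in sections.items():
            for option, default in options.items():
                key = (section, option)
                if key in owners:
                    # Fail loudly: a warning would be easy to miss in the logs.
                    raise ConfigConflictError(
                        f"[{section}] {option} is defined by both {owners[key]} and {provider}"
                    )
                owners[key] = provider
                merged.setdefault(section, {})[option] = default
    return merged


# Hypothetical usage: two providers adding distinct options to an existing section.
core_config = {"core": {"executor": "LocalExecutor"}}
provider_config = {
    "provider-a": {"core": {"option_a": "1"}},
    "provider-b": {"core": {"option_b": "2"}},
}
merged_config = merge_provider_config(core_config, provider_config)
```

Enforcing the rule at the section level instead would only need the `owners` key to be the section name rather than the (section, option) pair.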


Re: [VOTE] Make (soon coming) dask provider preinstalled

2023-07-21 Thread Oliveira, Niko
-1 (binding)

I think the eventual goal is all 3rd party executors (excluding Local, 
Sequential, etc) are not pre-installed. I think it will take a while for us to 
get there with Celery and K8s but it's the right thing to shoot for and we 
should start with Dask.


From: Collin McNulty 
Sent: Friday, July 21, 2023 8:37:38 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][VOTE] Make (soon coming) dask provider preinstalled


-1 (non-binding)
I agree with Ash’s reasoning.
--

Collin McNulty
Director of Global Support

Email: col...@astronomer.io 
Time zone: US Central (CST UTC-6 / CDT UTC-5)





Re: [DISCUSS] Moving Dask Executor to a separate (optional?) dask provider

2023-07-12 Thread Oliveira, Niko
I think in a perfect world we'd only have the completely vendor neutral 
executors pre-installed (Local, Sequential, Debug) and anything else would need 
to be specifically installed by admins/users. I think if we were starting from 
scratch this would make the most sense, but clearly Kubernetes and Celery 
executors are so ubiquitous that it'd cause too much wreckage to not install 
them, but I'd like to push for Dask to _not_ be installed by default. If this 
causes too much wreckage then perhaps we should deprecate that (though I'm not 
sure exactly what that would look like in this context), but it's difficult to 
measure how many folks are using the Dask executor. Perhaps we have data from 
the yearly questionnaire/survey we send?


From: Jarek Potiuk 
Sent: Wednesday, July 12, 2023 8:05:54 AM
To: dev@airflow.apache.org
Subject: [EXTERNAL] [DISCUSS] Moving Dask Executor to a separate (optional?) 
dask provider


Hello Everyone,

A small follow up after K8S/Celery executors being moved:
https://lists.apache.org/thread/7gyw7ty9vm0pokjxq7y3b1zw6mrlxfm8

We are in the process of moving Celery / Kubernetes executor (Celery almost
complete and I am working on K8S next + some common discovery and config
moving)

But there is one more "questionable" executor - i.e. Dask executor, still
living in Airflow Core.

When it comes to Celery/Kubernetes, we decided to make the two providers
preinstalled, because that makes the most sense. We are also going to get the
basic documentation in the "core" Airflow documentation so that it is
easier to discover and prominently visible, also because of
vendor-neutrality.

However when it comes to Dask I am not sure about its status and whether we
should make it preinstalled ?

I guess there is no doubt that it should move to a provider - this has the
same benefits as the Celery/K8S move. But whether it should be preinstalled
with Airflow - I am not sure. I do not know how frequently Dask executor
(and Dask) is used by people using Airflow, but I personally do not think
it should be as "closely" connected with Airflow as Celery/Kubernetes ones.

If we do not make it preinstalled, it is a somewhat (but not too much,
really) breaking change. We still might choose to install the dask provider in
the PROD reference image, so it will continue to work if you use the image,
and when you are installing airflow in venv you will only have to specify
`pip install apache-airflow[dask]` or manually install
`apache-airflow-providers-daskexecutor` (for now at least this is the name
I could reserve in PyPI). So this is not really breaking, it just requires
another dependency to be installed. But some pipelines of installing
Airflow might get broken because it won't be pre-installed - so this is
borderline breaking.
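
As a rough illustration of how an installation pipeline could detect the missing dependency described above, here is a small sketch using only the standard library. The helper name `is_distribution_installed` is hypothetical; the distribution name is the one reserved in the message above and may differ from whatever is finally published.

```python
from importlib import metadata


def is_distribution_installed(name: str) -> bool:
    """Return True if a distribution with this name is installed in the current environment."""
    try:
        metadata.version(name)
    except metadata.PackageNotFoundError:
        return False
    return True


# Name taken from the message above; treat it as provisional.
DASK_PROVIDER = "apache-airflow-providers-daskexecutor"

if not is_distribution_installed(DASK_PROVIDER):
    print(f"{DASK_PROVIDER} is not installed; try: pip install {DASK_PROVIDER}")
```

A check like this would let a deployment script fail early with a clear message instead of failing later when the executor is loaded.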

WDYT? Should we make the dask executor pre-installed or not?

J.


Re: [ANNOUNCE] New committers: Vincent Beck, Phani Kumar, Maciej Obuchowski

2023-06-28 Thread Oliveira, Niko
Congrats all! Very well deserved!


From: Wei Lee 
Sent: Tuesday, June 27, 2023 11:29:10 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][ANNOUNCE] New committers: Vincent Beck, Phani Kumar, 
Maciej Obuchowski


Congrats!!!

Best,
Wei

> On Jun 28, 2023, at 2:26 PM, Pankaj Singh  wrote:
>
> Congratulations!
>
> On Wed, Jun 28, 2023 at 11:53 AM Amogh Desai 
> wrote:
>
>> Wow! Congrats!!
>>
>>
>>
>> On Wed, Jun 28, 2023, 11:48 Pankaj Koti > .invalid>
>> wrote:
>>
>>> Amazing news at the start of the day!
>>>
>>> Very well deserved all of them, Vincent, Phani and Maciej!!
>>> Congratulations! Keep helping us and the community :)
>>>
>>>
>>> Regards,
>>>
>>>
>>>
>>> Pankaj Koti
>>>
>>> *Senior Software Engineer, *OSS Engineering Team.
>>> Location: Pune, India
>>>
>>> Timezone: Indian Standard Time (IST)
>>>
>>> Email: pankaj.k...@astronomer.io
>>>
>>> Mobile: +91 9730079985
>>>
>>>
>>> On Wed, Jun 28, 2023 at 11:47 AM Ephraim Anierobi <
>>> ephraimanier...@gmail.com>
>>> wrote:
>>>
 Congratulations!

 On Wed, Jun 28, 2023 at 7:15 AM Elad Kalif  wrote:

> Hello Airflow Community,
>
> I am happy to announce that the Project Management Committee (PMC)
>> for
> Apache Airflow has invited:
> *Vincent Beck* (Github: *vincbeck*)
> *Phani Kumar* (Github: *phanikumv*)
> *Maciej Obuchowski* (Github: *mobuchowski*)
>
> to become committers and we are pleased to announce that they have
> accepted.
>
> Congratulations and welcome aboard!
>
> Regards,
> Elad on behalf of Airflow PMC
>

>>>
>>





Re: [VOTE] AIP-56 Extensible user management

2023-06-19 Thread Oliveira, Niko
+1 (binding)


From: Jarek Potiuk 
Sent: Monday, June 19, 2023 7:38:08 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][VOTE] AIP-56 Extensible user management


+1 (binding).

On Mon, Jun 19, 2023 at 4:12 PM Beck, Vincent 
wrote:

> Dear Airflow community,
>
> I would like to start a vote for "AIP-56 Extensible user management".
>
> You can find the AIP here:
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-56+Extensible+user+management
>
> Discussion thread:
> https://lists.apache.org/thread/ck8dsj5w82lvr0cpwr4wlptmydqwnsqc
>
> This is my (non-binding) +1, the vote will last until 2pm (UTC) on Monday
> 26th June.
>
> Please vote accordingly:
>
> [ ] + 1 approve
> [ ] + 0 no opinion
> [ ] - 1 disapprove with the reason
>
> Only votes from PMC members and committers are binding, but other members
> of the community are encouraged to check the AIP and vote with
> "(non-binding)".
>
> Thanks
>


Re: [VOTE] Release Airflow 2.6.2 from 2.6.2rc1

2023-06-13 Thread Oliveira, Niko
+1 (non-binding)

checked files, licenses, signatures, checksums, installation, and a few example 
dags.
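
For readers who want to repeat the checksum part of this verification, a minimal sketch follows. It assumes the checksum file's first whitespace-separated token is the hex SHA-512 digest, which is an assumption about the file layout and not something stated in this thread; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(artifact: Path, checksum_file: Path) -> bool:
    """Compare an artifact against a checksum file.

    Assumption: the first whitespace-separated token in the checksum file
    is the expected hex digest.
    """
    expected = checksum_file.read_text().split()[0].lower()
    return sha512_of(artifact) == expected
```

Signature verification against the published KEYS file is a separate step and is not covered by this sketch.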

Cheers,
Niko


From: Beck, Vincent 
Sent: Tuesday, June 13, 2023 12:32:39 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][VOTE] Release Airflow 2.6.2 from 2.6.2rc1


+1 (non-binding)

Tested against my testing DAGs.

On 2023-06-13, 12:04 PM, "Pankaj Koti" <pankaj.k...@astronomer.io.INVALID> wrote:




+1 (non-binding)


Tested my set of changes and additionally ran a few DAGs.


Regards,






Pankaj Koti


*Senior Software Engineer, *OSS Engineering Team.
Location: Pune, India


Timezone: Indian Standard Time (IST)


Email: pankaj.k...@astronomer.io 


Mobile: +91 9730079985




On Tue, Jun 13, 2023 at 1:45 PM Elad Kalif <elad...@apache.org> wrote:


> Hey fellow Airflowers,
>
> I have cut Airflow 2.6.2rc1. This email is calling a vote on the release,
> which will last at least 72 hours, from Tuesday, June 13, 2023 at 08:15 AM
> UTC
> until Friday, June 16, 2023 at 08:15 AM UTC, and until 3 binding +1 votes
> have been received.
>
> Vote countdown timer:
> https://www.timeanddate.com/countdown/to?iso=20230616T0815=1440 
> 
>
> Status of testing of the release is kept at
> https://github.com/apache/airflow/issues/31867 
> 
>
> Consider this my (binding) +1.
>
> Airflow 2.6.2rc1 is available at:
> https://dist.apache.org/repos/dist/dev/airflow/2.6.2rc1/ 
> 
>
> *apache-airflow-2.6.2-source.tar.gz* is a source release that comes with
> INSTALL instructions.
> *apache-airflow-2.6.2.tar.gz* is the binary Python "sdist" release.
> *apache_airflow-2.6.2-py3-none-any.whl* is the binary Python wheel "binary"
> release.
>
> Public keys are available at:
> https://dist.apache.org/repos/dist/release/airflow/KEYS 
> 
>
> Please vote accordingly:
>
> [ ] +1 approve
> [ ] +0 no opinion
> [ ] -1 disapprove with the reason
>
> Only votes from PMC members are binding, but all members of the community
> are encouraged to test the release and vote with "(non-binding)".
>
> The test procedure for PMCs and Contributors who would like to test this RC
> are described in
>
> https://github.com/apache/airflow/blob/main/dev/README_RELEASE_AIRFLOW.md#verify-the-release-candidate-by-pmcs
>
> Please note that the version number excludes the `rcX` string, so it's now
> simply 2.6.2. This will allow us to rename the artifact without modifying
> the artifact checksums when we actually release.
>
> Release Notes:
> https://github.com/apache/airflow/blob/2.6.2rc1/RELEASE_NOTES.rst 
> 
>
> Changes since 2.6.1:
> *Bug Fixes*:
>
> - Cascade update of TaskInstance to TaskMap table (#31445)
> - Fix Kubernetes executors detection of deleted pods (#31274)
> - Use keyword parameters for migration methods for mssql (#31309)
> - Control permissibility of driver config in extra from airflow.cfg
> (#31754)
> - Fixing broken links in openapi/v1.yaml (#31619)
> - Hide old alert box when testing connection with different value (#31606)
> - Add TriggererStatus to OpenAPI spec (#31579)
> - Resolving issue where Grid won't un-collapse when Details is
> collapsed (#31561)
> - Fix sorting of tags (#31553)
> - Add the missing ``map_index`` to the xcom key when skipping
> downstream tasks (#31541)
> - Fix airflow users delete CLI command (#31539)
> - Include triggerer health status in Airflow ``/health`` endpoint (#31529)
> - Remove dependency already registered for this task warning (#31502)
> - Use kube_client over default CoreV1Api for deleting pods (#31477)
> - Ensure min backoff in base sensor is at least 1 (#31412)
> - Fix ``max_active_tis_per_dagrun`` for Dynamic Task Mapping (#31406)
> - Fix error handling when pre-importing modules in DAGs (#31401)
> - Fix dropdown default and adjust tutorial to use 42 as default for
> proof (#31400)
> - Fix crash when clearing run with task from normal to mapped (#31352)
> - Make BaseJobRunner a generic on the job class (#31287)
> - Fix ``url_for_asset`` fallback and 404 on DAG Audit Log (#31233)
> - Don't present an undefined execution date (#31196)
> - Added spinner activity while the logs load (#31165)
> - Include rediss to the list of supported URL schemes (#31028)
> - Optimize scheduler by skipping "non-schedulable" DAGs (#30706)
> - Save scheduler execution time during search for queued dag_runs (#30699)
> - Fix ExternalTaskSensor to work correctly with task groups (#30742)
> - Fix 

Re: [ANNOUNCE] New committer: Pankaj Singh

2023-06-12 Thread Oliveira, Niko
Woo! Congrats Pankaj!


From: Ankit Chaurasia 
Sent: Monday, June 12, 2023 3:12:42 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][ANNOUNCE] New committer: Pankaj Singh


Congrats Pankaj!








On Tue, Jun 13, 2023 at 3:38 AM Hussein Awala  wrote:

> Congrats Pankaj!
>
> On Mon 12 Jun 2023 at 23:38, Jarek Potiuk  wrote:
>
> > Woohooo!
> >
> > On Mon, Jun 12, 2023 at 11:31 PM Beck, Vincent
>  > >
> > wrote:
> >
> > > Congrats!
> > >
> > > On 2023-06-12, 5:28 PM, "Kaxil Naik"  > > kaxiln...@apache.org>> wrote:
> > >
> > >
> > > Hello Airflow Community,
> > >
> > >
> > > I am happy to announce that the Project Management Committee (PMC) for
> > > Apache Airflow has invited *Pankaj Singh* (Github: *pankajastro*) to
> > become
> > > a committer and we are pleased to announce that he has accepted.
> > >
> > >
> > > Pankaj has played an instrumental role in adding deferrable operators
> to
> > > Airflow, helped first-time contributors on GitHub & Slack, reviewed
> 100s
> > of
> > > PRs and actively helped in testing release candidates of Airflow and
> > > providers.
> > >
> > >
> > > Congratulations *Pankaj*, and welcome aboard!
> > >
> > >
> > > Regards,
> > > Kaxil on behalf of Airflow PMC
> > >
> > >
> > >
> > >
> >
>


Re: [ANNOUNCE] New committer Hussein Awala

2023-04-11 Thread Oliveira, Niko
Congrats Hussein!


From: Sung Yun 
Sent: Tuesday, April 11, 2023 4:41:15 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][ANNOUNCE] New committer Hussein Awala


Congratulations Hussein!!

Sung

Sent from my iPhone

> On Apr 11, 2023, at 5:56 PM, Beck, Vincent  
> wrote:
>
> Congrats!
>
> On 2023-04-11, 3:54 PM, "Pierre Jeambrun"  > wrote:
>
>
> Congratulations Hussein, well deserved !
>
>
>> On Tue 11 Apr 2023 at 23:51, Jarek Potiuk > > wrote:
>>
>>
>> Hello Airflow Community,
>>
>> I am happy to announce that the Project Management Committee
>> (PMC) for Apache Airflow has invited Hussein Awala
>> (github nickname hussein-awala) to become a
>> committer and we are pleased to announce that
>> he has accepted.
>>
>> Congratulations Hussein, and welcome!
>>
>> Jarek on behalf of Airflow PMC
>>
>
>
>




Re: [DISCUSS] AIP-52 updates - setup / teardown tasks

2023-03-27 Thread Oliveira, Niko
Chiming in on a few of the topics discussed so far:

- Context managers:
I found most of the context manager syntax proposals a little hard to 
understand, but some better than others. Ultimately if I put my DAG author hat 
on, I find this declaration the most straightforward, clear and it's easy to 
update existing code: chain(create_notification_channel.as_setup(), ... other 
tasks ... 
delete_notification_channel.teardown_for(create_notification_channel),...)

- Short Circuit Behaviour

Again putting my DAG author hat on, the example Daniel gave about skipping 
teardown steps in the event of a critical failure (so that we can debug files, 
clusters, etc left behind) is exactly something my team is interested in, and 
we're trying to build these things ourselves. Having the ShortCircuitOperator 
support this (although perhaps not by default) would be fantastic and as a user 
I would like to see this functionality maintained.

- Multiple setup/teardown

I also agree that whether we ship them now or later, we must ensure we don't 
walk through any one-way doors for future implementation. It's great that we're 
having these discussions early.
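
The setup/teardown and short-circuit points above can be sketched with a toy runner in plain Python. This is not Airflow code and not the AIP-52 implementation: `run_with_teardown` and its `skip_teardown_on_failure` flag are hypothetical names, and the callables only borrow the task names used in the example above. It shows the rule that teardown runs whenever setup ran, with an opt-in escape hatch for leaving resources behind to debug a failure.

```python
from typing import Callable, List, Optional, Sequence


def run_with_teardown(
    setup: Callable[[], None],
    work: Sequence[Callable[[], None]],
    teardown: Callable[[], None],
    skip_teardown_on_failure: bool = False,
) -> List[str]:
    """Run setup, the work steps, then teardown, returning a log of what happened.

    If setup itself raises, nothing was created, so teardown is not attempted.
    """
    log: List[str] = []
    setup()
    log.append("setup")
    failure: Optional[Exception] = None
    try:
        for step in work:
            step()
            log.append(step.__name__)
    except Exception as exc:  # a real runner would record and re-raise this
        failure = exc
        log.append("failed")
    if failure is None or not skip_teardown_on_failure:
        # Default behaviour: setup ran, so teardown must run.
        teardown()
        log.append("teardown")
    else:
        # Opt-in behaviour: keep resources around for debugging.
        log.append("teardown skipped")
    return log


def create_notification_channel() -> None:
    pass


def send_notification() -> None:
    raise RuntimeError("simulated critical failure")


def delete_notification_channel() -> None:
    pass


default_log = run_with_teardown(
    create_notification_channel, [send_notification], delete_notification_channel
)
debug_log = run_with_teardown(
    create_notification_channel,
    [send_notification],
    delete_notification_channel,
    skip_teardown_on_failure=True,
)
```

Keeping the skip behaviour behind an explicit flag mirrors the "perhaps not by default" preference stated above.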

Cheers,
Niko




From: Daniel Standish 
Sent: Monday, March 27, 2023 9:43:07 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][DISCUSS] AIP-52 updates - setup / teardown tasks


>
> 1) I am not sure if we should make it private (I am not even sure what
> it would mean to be private:) ). But If it means that setting the rule
> type for non-teardown task should raise an error (and of course
> documenting this rule as only (and automatically) being applied to
> teardown task - I am good with it.


Yeah I have no concerns with throwing error if this rule is used "manually"
by user.  Like, at init time though, keeping it simple.  I.e. I think we
shouldn't boil the ocean to prevent a creative user from finding a way to
use it in some weird way.

2) Yes. The teardown must run in any case. I think documenting the
> behaviour in ShortCircuit as a special case and - especially - adding
> some examples showing that would be enough


Clarify Jarek, are you saying that we should allow ShortCircuit to "short
circuit" the teardowns [either by default or perhaps optionally with a
skip_teardowns_too param] and document with examples? Or are you saying
shortcircuit operator should be updated so it's not possible to use it to
skip a teardown?









On Mon, Mar 27, 2023 at 12:30 AM Jarek Potiuk  wrote:

> My view:
>
> 1) I am not sure if we should make it private (I am not even sure what
> it would mean to be private:) ). But If it means that setting the rule
> type for non-teardown task should raise an error (and of course
> documenting this rule as only (and automatically) being applied to
> teardown task - I am good with it.
>
> 2) Yes. The teardown must run in any case. I think documenting the
> behaviour in ShortCircuit as a special case and - especially - adding
> some examples showing that would be enough
>
> And yeah while it is indeed an implementation detail, it is somewhat
> "visible" as the public API is being concerned. So no wonder it came
> as a bit of a surprise while implementing setup/teardown (happens that
> something we consider as a detail for those deeply involved is a bit
> of a surprise for those that came to look at it at review time).  I
> guess this is somewhat a consequence of the way we operate in a
> distributed environment and some details are being discussed in
> smaller groups that are more focused on getting things done (we had a
> few of those for AIP-44 for one).
>
> But eventually we are discussing it now, so I think it is cool.
>
> J.
>
> On Mon, Mar 27, 2023 at 8:43 AM Ash Berlin-Taylor  wrote:
> >
> > If the set-up ran then the tear down _must_ run. No question.
> >
> > Nothing should be able to change this fact. If you can, then they don't
> fulfill the stated purpose of tear down tasks in the AIP: to tidy up
> resources created by a set up task.
> >
> > On 27 March 2023 06:22:47 BST, Daniel Standish
>  wrote:
> > >>
> > >> When user set setup/teardown he has no idea unique trigger rule is set
> > >> under the hood. The user also has no idea that trigger rules are even
> > >> involved. That is not something he sees unless he checks the code of
> > >> teardown and setup decorators.
> > >
> > >This means that users of ShortCircuitOperator will not even know they need
> > >> to take action (until it won't work as expected) and they will probably
> > >> start asking questions.
> > >
> > >Yeah, short circuit operator is a special operator that, if you're
> going to
> > >use it, you ought to know how it works.  And we can easily add a note in
> > >the docs that emphasizes its behavior on this point.  But I should point
> > >out, the 

Re: [DISCUSS] AIP-55 Rule-based timetable with logical composition

2023-03-27 Thread Oliveira, Niko
I love this idea, it's definitely helpful!

I think an interesting topic to discuss for this project would be some kind of 
UI based date/calendar picker to help users construct these logical 
compositions. Something like `days("D1", "D2", "THU-SAT", "4>", "L1")` is quite 
inscrutable.
A UI component to at least visualize the composition you've created would be 
super powerful, and if it could be used to modify or create the compositions 
that would be an even better user experience.
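
To make the idea of logical composition more concrete, here is a small sketch in plain Python. It is not the AIP-55 API: the `Rule` class and the `weekday` and `on_dates` primitives are hypothetical, and the holiday dates are made up for illustration. It composes date predicates with and, or, and not to express "skip holidays, except when the day is a Monday".

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable


@dataclass(frozen=True)
class Rule:
    """A predicate over dates that supports &, | and ~ composition."""

    matches: Callable[[date], bool]

    def __and__(self, other: "Rule") -> "Rule":
        return Rule(lambda d: self.matches(d) and other.matches(d))

    def __or__(self, other: "Rule") -> "Rule":
        return Rule(lambda d: self.matches(d) or other.matches(d))

    def __invert__(self) -> "Rule":
        return Rule(lambda d: not self.matches(d))


def weekday(*days: int) -> Rule:
    """Match dates whose weekday (Monday is 0) is one of the given values."""
    wanted = frozenset(days)
    return Rule(lambda d: d.weekday() in wanted)


def on_dates(*dates: date) -> Rule:
    """Match an explicit set of dates, for example a holiday calendar."""
    wanted = frozenset(dates)
    return Rule(lambda d: d in wanted)


# Hypothetical holiday calendar for illustration only.
holidays = on_dates(date(2023, 3, 23), date(2023, 3, 27))
monday = weekday(0)

# Run on every non-holiday, and also on holidays that fall on a Monday.
schedule = ~holidays | monday
```

A UI like the one suggested above could render such a rule by evaluating `schedule.matches` for each day of a displayed month.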

Cheers,
Niko


From: Jarek Potiuk 
Sent: Sunday, March 26, 2023 2:52:16 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][DISCUSS] AIP-55 Rule-based timetable with logical 
composition


I am not sure if it needs an AIP - just PR implementing it and
discussion there should be more than enough IMHO.

On Thu, Mar 23, 2023 at 10:51 AM Malthe  wrote:
>
> This AIP comes out of a previous discussion on skipping tasks based on
> a rule-based schedule, e.g. excluding holidays except if it's Monday.
>
> The central idea is to define a schedule based on logical composition
> (and, or, not) – using a small number of primitives.
>
> https://cwiki.apache.org/confluence/display/AIRFLOW/%5BWIP%5D+AIP-55+Rule-based+timetable+with+logical+composition
>
> Cheers



Re: Request for feedback on proposal for new OpenLineage provider in Airflow

2023-03-23 Thread Oliveira, Niko
I'd like to join as well! (oliveira...@gmail.com)


From: Igor Kholopov 
Sent: Wednesday, March 22, 2023 4:01:40 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL]Request for feedback on proposal for new OpenLineage 
provider in Airflow


+1, would be happy to join the session! (Please add either
ikholo...@google.com or kholopo...@gmail.com).

Best,
Igor

On Wed, Mar 22, 2023 at 11:27 PM Pierre Jeambrun 
wrote:

> Same here if you can add me please.
>
> Looking forward to this session.
>
> Le mer. 22 mars 2023 à 23:07, Mehta, Shubham  a
> écrit :
>
> > Please include me, I will try my best to join (shubhammehta...@gmail.com
> )
> >
> > Best,
> > Shubham
> >
> > On 2023-03-22, 2:24 PM, "Jarek Potiuk"  > ja...@potiuk.com>> wrote:
> >
> >
> > There are some strange behaviours in the calendar entry - I think you
> > cannot add yourself, only guests can add others :)
> > I've added you Eugen, maybe if someone wants to be also added - please
> > post here with your gmail/calendar addresses.
> >
> >
> > J.
> >
> >
> > On Wed, Mar 22, 2023 at 9:56 PM Eugen Kosteev   > eu...@kosteev.com>> wrote:
> > >
> > > Hi Julien.
> > >
> > > Can you, please, include me there as well: eu...@kosteev.com  > eu...@kosteev.com> or
> > > kost...@google.com .
> > > Looking forward to see presentation.
> > >
> > > - Eugene
> > >
> > > On Wed, Mar 22, 2023 at 8:36 PM Julien Le Dem
>  > lid>
> > > wrote:
> > >
> > > > Hello all,
> > > > I have to move the OpenLineage presentation to next week.
> > > > Sorry for the change.
> > > > It will be Friday next week March 31st at 5pm CET 9am PT.
> > > >
> > > >
> >
> https://calendar.google.com/calendar/event?action=TEMPLATE=MTF1bHRrdTdrM29vMGZyamdzc2JuZWFkMHEganVsaWVuQGFzdHJvbm9tZXIuaW8=julien%40astronomer.io
> > > > Julien
> > > >
> > > > On Thu, Mar 16, 2023 at 8:21 PM Julien Le Dem  > >
> > > > wrote:
> > > >
> > > > > We are planning to do this session next Thursday at 5pm CET 9am
> PT. I
> > > > will
> > > > > send a zoom link in advance.
> > > > > Julien
> > > > >
> > > > > On Sat, Feb 25, 2023 at 05:59 Jarek Potiuk  > > wrote:
> > > > >
> > > > >> Cool. I am looking forward to it :). It would be great to get some
> > > > >> insight from those who attempted to get the lineage working in
> > several
> > > > >> versions of Open Lineage and finally arrived at the current
> > > > >> specs/integration.
> > > > >>
> > > > >> On Wed, Feb 22, 2023 at 7:02 PM Julien Le Dem
> > > > >> mailto:jul...@astronomer.io.inva>lid>
> > wrote:
> > > > >> >
> > > > >> > Thank you Jarek,
> > > > >> > I am happy to organize a zoom presentation about OpenLineage and
> > > > >> > answer any question. It is indeed a spec decoupling the data
> > > > >> > transformation layer from the Metadata store people are using. Just
> > > > >> > like OpenTelemetry is for service metrics/traces.
> > > > >> > Best,
> > > > >> > Julien
> > > > >> >
> > > > >> > On Tue, Feb 21, 2023 at 11:23 PM Jarek Potiuk wrote:
> > > > >> >>
> > > > >> >> And to add a little "parallel" - I think Open Lineage integration
> > > > >> >> replacing our "generic lineage" is very similar step to the new
> > > > >> >> "Multi-tenant"-ready authentication interface we are discussing in
> > > > >> >> https://lists.apache.org/thread/cc9dj680nwz494k8n51w6qqohzm4wgck
> > > > >> >>
> > > > >> >> Yes - we have a generic authentication interface, but no - it's
> > > > >> >> useless for the case where multi-tenancy and good level of resource
> > > > >> >> authorization is needed. It's just far too simplistic and limited.
> > > > >> >>
> > > > >> >> Same with current lineage generic interface - yes, we have it but
> > > > >> >> it's only useful in a limited set of cases. and if we want to
> > > > >> >> step-it-up we need to come up with something better (and Open Lineage
> > > > >> >> happens to be one that has been developed with Airflow in mind and
> > > > >> >> battle tested).
> > > > >> >>
> > > > >> >> J.
> > > > >> >>
> > > > >> >> On Wed, Feb 22, 2023 at 8:16 AM Jarek Potiuk wrote:
> > > > >> >>>
> > > > >> >>> Hey Rafał (Eugene, Michal - and others who are looking),
> > > > >> >>>
> > > > >> >>> I 

Re: [ANNOUNCE] New PMC member: Brent Bovenzi

2023-03-15 Thread Oliveira, Niko
Congrats Brent!!


From: Jorrick Sleijster 
Sent: Wednesday, March 15, 2023 9:09:07 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][ANNOUNCE] New PMC member: Brent Bovenzi




Congratulations Brent! It has always been a pleasure to work with you

On Wed, 15 Mar 2023, 11:36 Igor Kholopov, 
wrote:

> Congrats, Brent!
>
> On Wed, Mar 15, 2023 at 10:47 AM Kaxil Naik  wrote:
>
> > Congrats Brent
> >
> > On Wed, 15 Mar 2023 at 08:18, Hussein Awala  wrote:
> >
> > > Congratulations Brent!
> > >
> > > 
> > > From: Pankaj Koti 
> > > Sent: Wednesday, March 15, 2023 7:56:44 AM
> > > To: dev@airflow.apache.org
> > > Subject: Re: [ANNOUNCE] New PMC member: Brent Bovenzi
> > >
> > > Congratulations, Brent!
> > >
> > > > On Wed, 15 Mar 2023, 06:23 Josh Fell wrote:
> > >
> > > > Amazing, congrats Brent!
> > > >
> > > > > On Tue, Mar 14, 2023 at 8:50 PM Jarek Potiuk wrote:
> > > > >
> > > > > > Hello Airflow Community,
> > > > > >
> > > > > > I have the pleasure to announce that The Project Management Committee
> > > > > > (PMC) for Apache Airflow has invited Brent Bovenzi to become Apache
> > > > > > Airflow PMC Member and we are pleased to announce that he has kindly
> > > > > > accepted it.
> > > > > >
> > > > > > Being a PMC member enables assistance with the management and to guide
> > > > > > the direction of the project.
> > > > > >
> > > > > > Congratulations Brent, very well deserved indeed.
> > > > > >
> > > > > > Regards,
> > > > > > Jarek
> > > > > > on behalf of Airflow PMC
> > > > > >
> > > > > > -
> > > > > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > > > > For additional commands, e-mail: dev-h...@airflow.apache.org
> > > > >
> > > > >
> > > >
> > >
> >
>


Re: [ANNOUNCE] New PMC member: Pierre Jeambrun

2023-03-15 Thread Oliveira, Niko
Congrats Pierre, well deserved!


From: Kaxil Naik 
Sent: Wednesday, March 15, 2023 2:47:31 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][ANNOUNCE] New PMC member: Pierre Jeambrun




Congrats Pierre 

On Wed, 15 Mar 2023 at 08:18, Hussein Awala  wrote:

> Congratulations Pierre!
>
> 
> From: Pankaj Koti 
> Sent: Wednesday, March 15, 2023 7:57:19 AM
> To: dev@airflow.apache.org
> Subject: Re: [ANNOUNCE] New PMC member: Pierre Jeambrun
>
> Congratulations, Pierre!
>
> On Wed, 15 Mar 2023, 11:01 Robert Karish,  wrote:
>
> > Congratulations Pierre!
> >
> > On Tue, Mar 14, 2023 at 8:53 PM Josh Fell wrote:
> >
> > > Well deserved indeed, congrats Pierre!
> > >
> > > On Tue, Mar 14, 2023 at 8:52 PM Jarek Potiuk  wrote:
> > >
> > > > Hello Airflow Community,
> > > >
> > > > I have the pleasure to announce that The Project Management Committee
> > > > (PMC) for Apache Airflow has invited Pierre Jeambrun to become Apache
> > > > Airflow PMC Member and we are pleased to announce that he has kindly
> > > > accepted it.
> > > >
> > > > Being a PMC member enables assistance with the management and to guide
> > > > the direction of the project.
> > > >
> > > > Congratulations Pierre, certainly deserved!
> > > >
> > > > Regards,
> > > > Jarek
> > > > on behalf of Airflow PMC
> > > >
> > > > -
> > > > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > > > For additional commands, e-mail: dev-h...@airflow.apache.org
> > > >
> > > >
> > >
> >
>


Re: [VOTE] Airflow Providers prepared on March 07, 2023

2023-03-07 Thread Oliveira, Niko
+1 (non-binding). Checked files, install (using the newly updated pmc 
Dockerfile :D), license and checksum.


From: Jarek Potiuk 
Sent: Tuesday, March 7, 2023 1:53:26 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][VOTE] Airflow Providers prepared on March 07, 2023




+1 binding. Checks licence, signature, checksum.

On Tue, Mar 7, 2023 at 6:02 PM Kaxil Naik  wrote:
>
> +1 binding
>
> On Tue, 7 Mar 2023 at 13:04, Hussein Awala  wrote:
>
> > +1 (non-binding)
> >
> > 
> > From: Elad Kalif 
> > Sent: Tuesday, March 7, 2023 9:26:17 AM
> > To: dev@airflow.apache.org
> > Subject: [VOTE] Airflow Providers prepared on March 07, 2023
> >
> > Hey all,
> >
> > I have just cut ad hoc release for Airflow Hashicorp Provider
> > package (RC2). This email is calling a vote on the release, which will
> > last for 72 hours - which means that it will end on March 10, 2023
> > 08:30 AM UTC
> >
> > Consider this my (binding) +1.
> >
> > Airflow Providers are available at:
> > https://dist.apache.org/repos/dist/dev/airflow/providers/
> >
> > *apache-airflow-providers--*.tar.gz* are the binary
> > Python "sdist" release - they are also official "sources" for the
> > provider packages.
> >
> > *apache_airflow_providers_-*.whl are the binary
> > Python "wheel" release.
> >
> > The test procedure for PMC members who would like to test the RC
> > candidates are described in
> > https://github.com/apache/airflow/blob/main/dev/README_RELEASE_PROVIDER_PACKAGES.md#verify-the-release-by-pmc-members
> >
> > and for Contributors:
> > https://github.com/apache/airflow/blob/main/dev/README_RELEASE_PROVIDER_PACKAGES.md#verify-by-contributors
> >
> > Public keys are available at:
> > https://dist.apache.org/repos/dist/release/airflow/KEYS
> >
> > Please vote accordingly:
> >
> > [ ] +1 approve
> > [ ] +0 no opinion
> > [ ] -1 disapprove with the reason
> >
> > Only votes from PMC members are binding, but members of the
> > community are encouraged to test the release and vote with
> > "(non-binding)".
> >
> > Please note that the version number excludes the 'rcX' string.
> > This will allow us to rename the artifact without modifying
> > the artifact checksums when we actually release.
> >
> > The status of testing the providers by the community is kept here:
> > https://github.com/apache/airflow/issues/29948
> >
> > You can find packages as well as detailed changelog following the
> > below links:
> >
> > https://pypi.org/project/apache-airflow-providers-hashicorp/3.3.0rc2/
> >
> >
> > Cheers,
> >
> > Elad Kalif
> >




Re: [NOTICE] Upcoming global changes to default GitHub Actions behavior for outside collaborators

2023-02-14 Thread Oliveira, Niko
I agree this is completely untenable, at least for Airflow. I commented on the 
Jira ticket as well with more thoughts.


Cheers,
Niko


From: Jarek Potiuk 
Sent: Monday, February 13, 2023 4:08:23 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][NOTICE] Upcoming global changes to default GitHub 
Actions behavior for outside collaborators




It would be great to comment on the Jira ticket. I think there is some
misunderstanding of the problem on the INFRA side, and I think we need to
convince them that they have not assessed the consequences properly.

On Tue, Feb 14, 2023 at 01:02, Pierre Jeambrun <pierrejb...@gmail.com> wrote:
Hello,

I share Jarek and Dennis' concerns.

It would be very hard to maintain enough responsiveness to not discourage 
external contributions while still trying to actually check the changes before 
approving a workflow.

We have hundreds of workflows a day (~150-200 in the last 24 hours; it would 
be interesting to have an average number here). Even without internal 
contributions that would still leave a substantial number to check. Divide 
that by the number of active committers and this is... terrifying.

I really hope that we can find another way to prevent GHA abuse.

Best Regards,
Pierre

On Mon, Feb 13, 2023 at 21:59, Jarek Potiuk <ja...@potiuk.com> wrote:
For others who might share the same concerns, here is my ticket where I
explain what effects it will have on our project; in a comment I
also respond to Greg's worries about stealing individual accounts.

https://issues.apache.org/jira/browse/INFRA-24200

Maybe for other projects it is not as important as it is for Airflow,
maybe the amount of traffic and outside contributors is not that bad -
and for those projects I think the policy might make sense.
But I strongly believe that for many projects that have a lot of
outside contributors it will have a similar effect as I believe it
will have for Airflow (and the goal of increased security will not be
achieved).

And I do not want to argue, Greg, nor shout at anyone (so just
anticipating, I would really appreciate not shouting at me for raising
a yellow flag).

I am not saying that it is all "wrong" and making a revolution. I just
think that you should reconsider the policy of disabling it for
everyone and then "justifying" why you need an exception rather than
just (how it was so far) choosing appropriate policy via .asf.yml.

I believe the reasons everyone will mention in their tickets will be
similar to ours and maybe, just maybe, simply leaving it up to a
project to control the policy (with default "require approval") is
much better than forcing it top-down and expecting some kind of
justification.

Quoting a person from my project:

'Yeah, that sounds like a really bad decision for our workflow.  It
makes me wonder how other projects are handling their workflow if this
doesn't break them.  I can only see this working for a small team who
are all/mostly committers and rarely get outside contributions.'


J.


On Mon, Feb 13, 2023 at 9:26 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>
> Surely. I will.
>
> On Mon, Feb 13, 2023 at 9:01 PM Greg Stein <gst...@gmail.com> wrote:
> >
> > 1. JohnDoe submits a PR, and somebody on the PMC flips the bit to allow GHA 
> > to run now and in the future.
> > 2. BlackHat steals JohnDoe's credentials
> > 3. BlackHat submits a PR to mine crypto. GHA starts running before any 
> > human can stop it.
> >
> > Explain how to correct that in your ticket.
> >
> > Cheers,
> > -g
> >
> >
> > On Mon, Feb 13, 2023 at 1:56 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> >>
> >> I will raise a ticket and explain.
> >>
> >> But this would be a huge blow to the Airflow community and lead to almost
> >> immediate burn-out of the active committers if it goes live for
> >> Airflow. And likely many other projects.
> >>
> >> I am very strongly convinced it should not be enforced.
> >>
> >> J.
> >>
> >> On Mon, Feb 13, 2023 at 8:51 PM Daniel Gruno <humbed...@apache.org> wrote:
> >> >
> >> > To Project PMCs:
> >> >
> >> > GitHub for Apache projects is currently set to allow a non-committer
> >> > contributor to use GitHub Actions if a previous pull request by that
> >> > person has been approved.
> >> >
> >> > This has raised some security concerns, and could cause issues with
> >> > overall use and availability of GitHub Actions.
> >> >
> >> > The Infrastructure Team proposes to change the default to “always
> >> > require approval for external contributors”. We intend to make this
> >> > change on Sunday the 19th of March, 2023.
> >> >
> >> > This change will apply to all GitHub repositories that do not already
> >> > have a specific GitHub Actions policy set.
> >> >
> >> > Projects that have a 

Re: [VOTE] AIP-53 OpenLineage in Airflow

2023-02-13 Thread Oliveira, Niko
+1 (binding)
Overall I think this will make future development and growth for OL in Airflow 
much easier which will hopefully lead to more adoption!


From: Vikram Koka 
Sent: Monday, February 13, 2023 8:20:23 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][VOTE] AIP-53 OpenLineage in Airflow




+1 binding.
I have been looking at the doc and having lineage integrated with Airflow as a 
provider makes sense to me.


On Mon, Feb 13, 2023 at 2:38 AM Kaxil Naik <kaxiln...@gmail.com> wrote:
+1 binding, this should make lineage a first-class citizen for Airflow users. 
Excited for this one

On Sun, 12 Feb 2023 at 07:57, Jarek Potiuk <ja...@potiuk.com> wrote:
A little side-track, a small comment on what Shubham wrote.

Yeah. I also noticed AIP-47 mentioned - but I considered that an
implementation detail. I read that those will be rather regular unit
tests (so not reaching out to external systems, as it makes little
sense and we definitely want to make open-lineage tests run regularly
with every PR - otherwise we would end up in the same boat as
currently, where the repos are separated out). I believe the AIP-47
mentioned there was more an attempt to say "the test coverage will be
high". Julien, am I right?

On Sat, Feb 11, 2023 at 11:57 PM Mehta, Shubham
 wrote:
>
> +1 non-binding. I'll be on the lookout for initial PRs to learn more about 
> the implementation details of how System Tests will be extended to cover 
> these changes, as well as the ongoing maintenance required from providers. 
> The proposed changes should definitely make it easier for Airflow customers 
> to adopt lineage and improve stability. I'm looking forward to seeing how 
> customers will end up using it!
>
>
> Shubham
>
>
>
> From: Julien Le Dem 
> Reply-To: "dev@airflow.apache.org" <dev@airflow.apache.org>
> Date: Friday, February 10, 2023 at 3:28 PM
> To: "dev@airflow.apache.org" <dev@airflow.apache.org>
> Subject: [EXTERNAL] [VOTE] AIP-53 OpenLineage in Airflow
>
>
>
>
>
>
> Dear Airflow community,
>
>
>
> Following the discussion thread over the past few weeks, I'd like to call a 
> vote on AIP-53 OpenLineage in Airflow:
>
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-53+OpenLineage+in+Airflow
>
>
>
> The discussion thread is linked in the confluence doc if you wish to consult 
> the history of the conversation. Thank you to all who contributed!
>
>
>
> This is my (non-binding!) +1, the vote will last until midnight (UTC) on 
> Friday 17th February.
>
>
>
> Thanks,
>
> Julien
>
>
>
> For reference, the Motivation section in the doc:
>
> Operational lineage collection is a common need to understand dependencies 
> between data pipelines and track end-to-end provenance of data. It enables 
> many use cases from ensuring reliable delivery of data through observability 
> to compliance and cost management.
>
> Publishing operational lineage is a core Airflow capability to enable 
> troubleshooting and governance.
>
> OpenLineage is a project that is part of the LFAI foundation and provides a 
> spec standardizing operational lineage collection and sharing across the data 
> ecosystem. Although it provides plugins for popular open source projects, its 
> intent is very similar to OpenTelemetry (also under the Linux Foundation 
> umbrella): to remain a spec for lineage exchange that projects - open source 
> or proprietary - implement.
>
> Built-in OpenLineage support in Airflow will make it easier and more reliable 
> for Airflow users to publish their operational lineage through the 
> OpenLineage ecosystem.
>
> The current external plugin maintained in the OpenLineage project depends on 
> Airflow and operator internals and breaks when those change. 
> Having a built-in integration ensures better first-class support for 
> exposing lineage that gets tested alongside other changes and is therefore more 
> stable.
>
> Today, OpenLineage consumers in the ecosystem include: Egeria (bank 
> compliance), Marquez (build your own metadata platform for compliance for 
> example), Microsoft Purview (Governance, …), Astro (data observability), 
> Amundsen. AWS recently blogged about using OpenLineage in the AWS ecosystem. 
> Other projects are at various levels of progress.
>
> On the producer side, there is support for open source projects like Airflow, 
> dbt, Spark, Flink, GreatExpectations and proprietary warehouses like 
> Snowflake, BigQuery, Redshift through API integration or SQL parsing.
>
> Examples of users talking about their usage of OpenLineage 

Re: [VOTE] Airflow Providers prepared on February 08, 2023

2023-02-09 Thread Oliveira, Niko
+1 (non-binding).
Installed the amazon provider rc and tested my tagged PR (and another while I 
was at it).
Also checked svn file counts, installation, checksums and licenses


From: Jarek Potiuk 
Sent: Thursday, February 9, 2023 12:31:20 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][VOTE] Airflow Providers prepared on February 08, 2023




+1 (binding). Tested signatures, checksums, licences.

On Thu, Feb 9, 2023 at 1:18 PM Pankaj Koti  
wrote:
+1 non-binding

Tested the Google cloud RC 
https://pypi.org/project/apache-airflow-providers-google/8.9.0rc1/ and my PR 
https://github.com/apache/airflow/pull/29349 which is part of it. DAG runs fine 
in both synchronous and asynchronous modes.

Regards,




Pankaj Koti

Senior Software Engineer, OSS Engineering Team.
Location: Pune, India

Timezone: Indian Standard Time (IST)

Email: pankaj.k...@astronomer.io

Mobile: +91 9730079985



On Wed, Feb 8, 2023 at 2:51 PM Elad Kalif <elad...@apache.org> wrote:

Hey all,

I have just cut the new wave Airflow Providers packages. This email is calling 
a vote on the release,
which will last for 72 hours - which means that it will end on February 11, 
2023 09:20 AM UTC
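
For anyone who wants to double-check the deadline arithmetic, here is a minimal sketch. The start timestamp below is an assumption (simply 72 hours before the deadline stated above), not a value taken from the mail headers, and `vote_end` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

VOTE_DURATION = timedelta(hours=72)


def vote_end(start: datetime, duration: timedelta = VOTE_DURATION) -> datetime:
    """Return the UTC moment at which a vote opened at `start` closes."""
    if start.tzinfo is None:
        # Naive datetimes are ambiguous, so refuse them outright.
        raise ValueError("start must be timezone-aware")
    return (start + duration).astimezone(timezone.utc)


# Assumed opening time: February 8, 2023 09:20 UTC.
opened = datetime(2023, 2, 8, 9, 20, tzinfo=timezone.utc)
closes = vote_end(opened)
print(closes.isoformat())
```

Keeping everything in timezone-aware UTC avoids surprises when voters read the thread from different local timezones.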

Consider this my (binding) +1.

Airflow Providers are available at:
https://dist.apache.org/repos/dist/dev/airflow/providers/

*apache-airflow-providers--*.tar.gz* are the binary
 Python "sdist" release - they are also official "sources" for the provider 
packages.

*apache_airflow_providers_-*.whl are the binary
 Python "wheel" release.

The test procedure for PMC members who would like to test the RC candidates are 
described in
https://github.com/apache/airflow/blob/main/dev/README_RELEASE_PROVIDER_PACKAGES.md#verify-the-release-by-pmc-members

and for Contributors:

https://github.com/apache/airflow/blob/main/dev/README_RELEASE_PROVIDER_PACKAGES.md#verify-by-contributors


Public keys are available at:
https://dist.apache.org/repos/dist/release/airflow/KEYS

Please vote accordingly:

[ ] +1 approve
[ ] +0 no opinion
[ ] -1 disapprove with the reason


Only votes from PMC members are binding, but members of the community are 
encouraged to test the release and vote with "(non-binding)".

Please note that the version number excludes the 'rcX' string.
This will allow us to rename the artifact without modifying
the artifact checksums when we actually release.
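
A rough illustration of why that works: a SHA-512 digest covers only the bytes of a file, never its name, so an artifact whose contents do not mention 'rcX' can be promoted without regenerating checksums. The file names and payload below are stand-ins made up for this sketch, and `final_version` is a hypothetical helper, not part of the release tooling.

```python
import hashlib
import re
import tempfile
from pathlib import Path

_RC_SUFFIX = re.compile(r"rc\d+$")


def final_version(candidate_version: str) -> str:
    """Drop a trailing 'rcN' marker, e.g. '7.2.0rc1' -> '7.2.0'."""
    return _RC_SUFFIX.sub("", candidate_version)


def sha512_of(path: Path) -> str:
    """Stream the file through SHA-512 and return the hex digest."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_before_and_after_rename():
    """Hash a stand-in artifact, rename it, and hash it again."""
    with tempfile.TemporaryDirectory() as tmp:
        staged = Path(tmp) / "staged-artifact.tar.gz"  # hypothetical name
        staged.write_bytes(b"stand-in bytes, not a real sdist")
        before = sha512_of(staged)
        released = staged.rename(Path(tmp) / "released-artifact.tar.gz")
        after = sha512_of(released)
    return before, after


if __name__ == "__main__":
    print(final_version("7.2.0rc1"))
    first, second = digest_before_and_after_rename()
    print(first == second)
```

The same `sha512_of` helper could be pointed at a downloaded artifact and compared with the published checksum file, though the documented verification procedure linked above remains the reference.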

The status of testing the providers by the community is kept here:
https://github.com/apache/airflow/issues/29424

You can find packages as well as detailed changelog following the below links:

https://pypi.org/project/apache-airflow-providers-amazon/7.2.0rc1/
https://pypi.org/project/apache-airflow-providers-apache-beam/4.2.0rc1/
https://pypi.org/project/apache-airflow-providers-apache-hive/5.1.2rc1/
https://pypi.org/project/apache-airflow-providers-arangodb/2.1.1rc1/
https://pypi.org/project/apache-airflow-providers-cncf-kubernetes/5.2.0rc1/
https://pypi.org/project/apache-airflow-providers-dbt-cloud/3.0.0rc1/
https://pypi.org/project/apache-airflow-providers-elasticsearch/4.4.0rc1/
https://pypi.org/project/apache-airflow-providers-ftp/3.3.1rc1/
https://pypi.org/project/apache-airflow-providers-google/8.9.0rc1/
https://pypi.org/project/apache-airflow-providers-microsoft-azure/5.2.0rc1/
https://pypi.org/project/apache-airflow-providers-mysql/4.0.1rc1/
https://pypi.org/project/apache-airflow-providers-presto/4.2.2rc1/
https://pypi.org/project/apache-airflow-providers-sftp/4.2.2rc1/
https://pypi.org/project/apache-airflow-providers-snowflake/4.0.3rc1/
https://pypi.org/project/apache-airflow-providers-tableau/4.1.0rc1/
https://pypi.org/project/apache-airflow-providers-trino/4.3.2rc1/



Cheers, Elad Kalif


Re: [DISCUSSION] Move K8S and Celery Executors (and related) to respective providers?

2023-01-29 Thread Oliveira, Niko
I would love to see this. I think it would legitimize the interface a bit more 
and also help to encourage folks to not abuse/leak it in the future.

AIP-51 is close to completion, I'd say 80%. We've boiled off most of the easier 
items and what's left is a few tricky decouplings (I have a PR for one of the 
trickier ones here [1] if anyone has some free time :)).

There are also lots of small items that aren't covered by the AIP, such as the 
one Andrey called out, which are not necessarily leaked abstractions from the 
base interface but are in a roundabout way still coupled to a particular 
executor or execution environment. There will be a long tail of these and we 
should start tracking them (perhaps a GH label?).

Cheers,
Niko

[1]: https://github.com/apache/airflow/pull/29055


From: Jarek Potiuk 
Sent: Sunday, January 29, 2023 1:20:38 AM
To: dev@airflow.apache.org
Subject: [EXTERNAL] [DISCUSSION] Move K8S and Celery Executors (and related) to 
respective providers?




Hello Everyone,

As a follow-up to AIP-51 - when it is completed (with few more quirks
like the one described by Andrey in the "Rendered Task Instance
Fields" discussion) - it should now be entirely possible to move
Kubernetes Executor, Celery Executor and related "joined" executors to
respective providers.

There is of course the question of where to move CeleryKubernetesExecutor,
but we have prior art with Transfer Operators and we might choose to
add it to "cncf.kubernetes" and make the celery provider an optional
dependency of K8S provider.

This has multiple advantages, and one of the biggest ones I can think
of is that we could evolve K8S executor faster than Airflow itself and
we would avoid quite a lot of complexities involved in parallel
modifications of K8S code in provider and executor (it happened in the
past that we had to add a very high min-airflow version for the
"cncf.kubernetes" provider in order to make this happen). We could
likely avoid similar problems by making sure K8S executor is used as
"yet another executor" - compliant with the AIP-51 interface - and we
could evolve it much faster.

That would also give celery provider some "real" meaning - so far
celery provider was merely exposing celery queue sensor, but when we
move CeleryExecutor to the provider, the provider would finally be a
useful one.

Doing this requires a few small things:
* we likely need to add dynamic import in the old "executors" package
following PEP 562 (and the way we've done operators) and deprecation
notice in case someone uses them from there.
* we likely need to add "cncf.kubernetes" and "celery" packages as
pre-installed providers - alongside http, ftp, common.sql, sqlite,
imap.
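
To make the first bullet concrete, here is a minimal sketch of a PEP 562 module-level `__getattr__` that forwards an old import path to a new one with a deprecation notice. All module and class names (`demo_new_home`, `demo_old_executors`, `DemoExecutor`) are hypothetical stand-ins built in memory so the snippet runs on its own; this is not the actual Airflow code.

```python
import importlib
import sys
import types
import warnings


class DemoExecutor:
    """Stand-in for an executor class that has moved to a provider."""


# Hypothetical new location, faked in memory for the sake of the sketch.
new_home = types.ModuleType("demo_new_home")
new_home.DemoExecutor = DemoExecutor
sys.modules["demo_new_home"] = new_home

# Names that used to live in the old module -> (new module, attribute).
_MOVED = {"DemoExecutor": ("demo_new_home", "DemoExecutor")}


def _lazy_getattr(name):
    """PEP 562 hook: invoked only when normal module lookup fails."""
    try:
        module_name, attr = _MOVED[name]
    except KeyError:
        raise AttributeError(f"module has no attribute {name!r}") from None
    warnings.warn(
        f"{name} has moved to {module_name}; please update your import.",
        DeprecationWarning,
        stacklevel=2,
    )
    # The real import happens here, only when the old name is requested.
    return getattr(importlib.import_module(module_name), attr)


# Hypothetical old location that only keeps the forwarding hook.
old_home = types.ModuleType("demo_old_executors")
old_home.__getattr__ = _lazy_getattr
sys.modules["demo_old_executors"] = old_home

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    from demo_old_executors import DemoExecutor as Imported

print(Imported is DemoExecutor, caught[0].category.__name__)
```

In a real package the hook would simply be a function named `__getattr__` defined at the top level of the old module; the in-memory modules are only there to keep the example self-contained.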

I **think**, after AIP-51 gets implemented, this would be a fully
backwards-compatible change - it's just a matter of proper management
of the dependencies. We could add min-cncf.kubernetes/celery versions
in Airflow 2.6, so that we are sure 2.6+ airflow uses only newer
providers and kubernetes/celery code and this would be fully
transparent for the users, I believe.

J.


Re: [VOTE] Airflow Providers prepared on January 14, 2023

2023-01-16 Thread Oliveira, Niko
> One comment Niko - for software release voting only PMC votes are binding :)

Oops, sorry, too eager XD


From: Jarek Potiuk 
Sent: Monday, January 16, 2023 11:17:38 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][VOTE] Airflow Providers prepared on January 14, 2023




One comment Niko - for software release voting only PMC votes are binding :)

+1 (binding) - tested signatures, checksums, licences. all looks good.

On Mon, Jan 16, 2023 at 7:42 PM Oliveira, Niko  
wrote:

+1 binding

Our system tests show the AWS provider package as stable, a couple known flakey 
tests but we have regular passes with green across the board (public dashboard 
coming very soon!)


From: Pankaj Singh <ags.pankaj1...@gmail.com>
Sent: Monday, January 16, 2023 10:20:00 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][VOTE] Airflow Providers prepared on January 14, 2023




+1 (non-binding)

On Mon, Jan 16, 2023 at 6:12 PM Rajath Srinivasaiah 
 wrote:
+1 (non-binding)
Our example DAGs ran fine.


Thanks and Regards,
Rajath S
rajath.srinivasa...@astronomer.io



On Mon, Jan 16, 2023 at 5:51 PM Pankaj Koti  
wrote:
+1 (non-binding)


Our example DAGs ran fine for the following 7 RCs:

https://pypi.org/project/apache-airflow-providers-amazon/7.1.0rc1/
https://pypi.org/project/apache-airflow-providers-apache-hive/5.1.1rc1/
https://pypi.org/project/apache-airflow-providers-cncf-kubernetes/5.1.1rc1/
https://pypi.org/project/apache-airflow-providers-dbt-cloud/2.3.1rc1/
https://pypi.org/project/apache-airflow-providers-google/8.8.0rc1/
https://pypi.org/project/apache-airflow-providers-http/4.1.1rc1/
https://pypi.org/project/apache-airflow-providers-microsoft-azure/5.1.0rc1/


Regards,




Pankaj Koti

Senior Software Engineer, OSS Engineering Team.
Location: Pune, India

Timezone: Indian Standard Time (IST)

Email: pankaj.k...@astronomer.io

Mobile: +91 9730079985



On Sat, Jan 14, 2023 at 7:04 PM Elad Kalif 
<elad...@apache.org> wrote:
Hey all,

I have just cut the new wave Airflow Providers packages. This email is calling 
a vote on the release,
which will last for 72 hours - which means that it will end on January 17, 2023 
01:30 PM UTC

Consider this my (binding) +1.

Airflow Providers are available at:
https://dist.apache.org/repos/dist/dev/airflow/providers/

*apache-airflow-providers--*.tar.gz* are the binary
 Python "sdist" release - they are also official "sources" for the provider 
packages.

*apache_airflow_providers_-*.whl are the binary
 Python "wheel" release.

The test procedure for PMC members who would like to test the RC candidates are 
described in
https://github.com/apache/airflow/blob/main/dev/README_RELEASE_PROVIDER_PACKAGES.md#verify-the-release-by-pmc-members

and for Contributors:

https://github.com/apache/airflow/blob/main/dev/README_RELEASE_PROVIDER_PACKAGES.md#verify-by-contributors


Public keys are available at:
https://dist.apache.org/repos/dist/release/airflow/KEYS

Please vote accordingly:

[ ] +1 approve
[ ] +0 no opinion
[ ] -1 disapprove with the reason


Only votes from PMC members are binding, but members of the community are 
encouraged to test the release and vote with "(non-binding)".

Please note that the version number excludes the 'rcX' string.
This will allow us to rename the artifact without modifying
the artifact checksums when we actually release.

The status of testing the providers by the community is kept here:
https://github.com/apache/airflow/issues/28938

You can find packages as well as detailed changelog following the below links:


https://pypi.org/project/apache-airflow-providers-amazon/7.1.0rc1/
https://pypi.org/project/apache-airflow-providers-apache-beam/4.1.1rc1/
https://pypi.org/project/apache-airflow-providers-apache-hive/5.1.1rc1/
https://pypi.org/project/apache-airflow-providers-apache-impala/1.0.0rc1/
https://pypi.org/project/apache-airflow-providers-cncf-kubernetes/5.1.1rc1

Re: [VOTE] Airflow Providers prepared on January 14, 2023

2023-01-16 Thread Oliveira, Niko
+1 binding

Our system tests show the AWS provider package as stable, a couple known flakey 
tests but we have regular passes with green across the board (public dashboard 
coming very soon!)


From: Pankaj Singh 
Sent: Monday, January 16, 2023 10:20:00 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][VOTE] Airflow Providers prepared on January 14, 2023




+1 (non-binding)

On Mon, Jan 16, 2023 at 6:12 PM Rajath Srinivasaiah 
 wrote:
+1 (non-binding)
Our example DAGs ran fine.


Thanks and Regards,
Rajath S
rajath.srinivasa...@astronomer.io



On Mon, Jan 16, 2023 at 5:51 PM Pankaj Koti  
wrote:
+1 (non-binding)


Our example DAGs ran fine for the following 7 RCs:

https://pypi.org/project/apache-airflow-providers-amazon/7.1.0rc1/
https://pypi.org/project/apache-airflow-providers-apache-hive/5.1.1rc1/
https://pypi.org/project/apache-airflow-providers-cncf-kubernetes/5.1.1rc1/
https://pypi.org/project/apache-airflow-providers-dbt-cloud/2.3.1rc1/
https://pypi.org/project/apache-airflow-providers-google/8.8.0rc1/
https://pypi.org/project/apache-airflow-providers-http/4.1.1rc1/
https://pypi.org/project/apache-airflow-providers-microsoft-azure/5.1.0rc1/


Regards,




Pankaj Koti

Senior Software Engineer, OSS Engineering Team.
Location: Pune, India

Timezone: Indian Standard Time (IST)

Email: pankaj.k...@astronomer.io

Mobile: +91 9730079985



On Sat, Jan 14, 2023 at 7:04 PM Elad Kalif <elad...@apache.org> wrote:

Hey all,

I have just cut the new wave Airflow Providers packages. This email is calling 
a vote on the release,
which will last for 72 hours - which means that it will end on January 17, 2023 
01:30 PM UTC

Consider this my (binding) +1.

Airflow Providers are available at:
https://dist.apache.org/repos/dist/dev/airflow/providers/

*apache-airflow-providers--*.tar.gz* are the binary
 Python "sdist" release - they are also official "sources" for the provider 
packages.

*apache_airflow_providers_-*.whl are the binary
 Python "wheel" release.

The test procedure for PMC members who would like to test the release candidates 
is described in
https://github.com/apache/airflow/blob/main/dev/README_RELEASE_PROVIDER_PACKAGES.md#verify-the-release-by-pmc-members

and for Contributors:

https://github.com/apache/airflow/blob/main/dev/README_RELEASE_PROVIDER_PACKAGES.md#verify-by-contributors


Public keys are available at:
https://dist.apache.org/repos/dist/release/airflow/KEYS

Please vote accordingly:

[ ] +1 approve
[ ] +0 no opinion
[ ] -1 disapprove with the reason


Only votes from PMC members are binding, but members of the community are 
encouraged to test the release and vote with "(non-binding)".

Please note that the version number excludes the 'rcX' string.
This will allow us to rename the artifact without modifying
the artifact checksums when we actually release.
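
To illustrate the point about checksums, here is a minimal Python sketch (not part of the release tooling; the file names are made up for illustration). It shows that a SHA-512 digest depends only on a file's bytes, so an artifact built without 'rcX' in its version can be renamed or promoted and its published checksum stays valid:

```python
import hashlib
import os
import shutil
import tempfile


def sha512_of(path):
    """Return the hex SHA-512 digest of the file's contents."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        # Read in chunks so large artifacts do not need to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digests_before_and_after_rename():
    """Write a dummy artifact, rename it, and return both digests."""
    workdir = tempfile.mkdtemp()
    try:
        # Hypothetical file names, used only for illustration.
        candidate = os.path.join(workdir, "example_provider-1.0.0.tar.gz")
        released = os.path.join(workdir, "example_provider-1.0.0-final.tar.gz")
        with open(candidate, "wb") as handle:
            handle.write(b"dummy artifact contents")
        before = sha512_of(candidate)
        os.rename(candidate, released)
        after = sha512_of(released)
        return before, after
    finally:
        shutil.rmtree(workdir)
```

Calling `digests_before_and_after_rename()` returns two identical digests, because only the name changed and not the content.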

The status of testing the providers by the community is kept here:
https://github.com/apache/airflow/issues/28938

You can find packages as well as detailed changelog following the below links:


https://pypi.org/project/apache-airflow-providers-amazon/7.1.0rc1/
https://pypi.org/project/apache-airflow-providers-apache-beam/4.1.1rc1/
https://pypi.org/project/apache-airflow-providers-apache-hive/5.1.1rc1/
https://pypi.org/project/apache-airflow-providers-apache-impala/1.0.0rc1/
https://pypi.org/project/apache-airflow-providers-cncf-kubernetes/5.1.1rc1/
https://pypi.org/project/apache-airflow-providers-common-sql/1.3.3rc1/
https://pypi.org/project/apache-airflow-providers-dbt-cloud/2.3.1rc1/
https://pypi.org/project/apache-airflow-providers-docker/3.4.1rc1/
https://pypi.org/project/apache-airflow-providers-elasticsearch/4.3.3rc1/
https://pypi.org/project/apache-airflow-providers-exasol/4.1.3rc1/
https://pypi.org/project/apache-airflow-providers-google/8.8.0rc1/
https://pypi.org/project/apache-airflow-providers-http/4.1.1rc1/
https://pypi.org/project/apache-airflow-providers-jenkins/3.2.0rc1/
https://pypi.org/project/apache-airflow-providers-microsoft-azure/5.1.0rc1/
https://pypi.org/project/apache-airflow-providers-microsoft-psrp/2.2.0rc1/
https://pypi.org/project/apache-airflow-providers-mongo/3.1.1rc1/
https://pypi.org/project/apache-airflow-providers-mysql/4.0.0rc1/

Re: [VOTE] AIP-52 Automatic setup and teardown tasks

2023-01-09 Thread Oliveira, Niko
+1 (binding)


Cheers,
Niko


From: Kevin Yang 
Sent: Monday, January 9, 2023 1:46:00 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][VOTE] AIP-52 Automatic setup and teardown tasks




+1 (binding)

On Mon, Jan 9, 2023 at 1:41 PM Felix Uellendall  wrote:
+1 (binding)

-Felix


Sent from Proton Mail for iOS


On Mon, Jan 9, 2023 at 22:05, Pierre Jeambrun <pierrejb...@gmail.com> wrote:
+1 (binding)

On Mon, Jan 9, 2023 at 21:12, Vikram Koka wrote:
+1 binding

Vikram


On Mon, Jan 9, 2023 at 11:23 AM Ping Zhang <pin...@umich.edu> wrote:
+1 binding

Thanks,

Ping


On Mon, Jan 9, 2023 at 11:22 AM Ephraim Anierobi <ephraimanier...@gmail.com> wrote:
+1 binding

On Mon, 9 Jan 2023 at 19:55, Frank Cash <cash.fra...@gmail.com> wrote:
+1 (non-binding)

On Mon, Jan 9, 2023 at 1:34 PM Josh Fell  
wrote:
+1 binding

On Mon, Jan 9, 2023 at 12:51 PM Drew Hubl  
wrote:
+1 (non-binding)

On Jan 9, 2023, at 10:10 AM, Elad Kalif <elad...@apache.org> wrote:

+1 (binding)

On Mon, Jan 9, 2023 at 7:07 PM Jarek Potiuk <ja...@potiuk.com> wrote:
+1 (binding)

On Mon, Jan 9, 2023 at 6:01 PM Ferruzzi, Dennis  
wrote:

+1 non-binding


From: Ash Berlin-Taylor <a...@apache.org>
Sent: Monday, January 9, 2023 8:27 AM
To: dev@airflow.apache.org
Subject: [EXTERNAL] [VOTE] AIP-52 Automatic setup and teardown tasks



Hello everyone,

I am calling for a vote on AIP-52 
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-52+Automatic+setup+and+teardown+tasks
There haven't been any notable changes to the original document, mostly just 
clarifications to the proposal.

This is my +1, and the vote will last until 10am (UTC) on Monday 16th January.

Thanks,
Ash
(To remind people what this is about, here is the example from the first 
discussion email)

```
# Proposed AIP-52 syntax (illustrative sketch; imports are as proposed).
from airflow import DAG, task, setup, teardown


with DAG(dag_id='test'):
    @setup
    def create_cluster():
        ...
        return cluster_id

    @task
    def load(ti):
        # Example:
        cluster_id = ti.xcom_pull(task_ids="create_cluster")

    @task
    def summarize():
        ...

    @teardown(on_failure_fail_dagrun=False)
    def teardown_cluster(ti):
        ...
        cluster_id = ti.xcom_pull(task_ids="create_cluster")

    create_cluster()
    load() >> summarize()
    teardown_cluster()
```




--
Charles Frank Cash
https://github.com/frankcash
https://keybase.io/frankcash



Re: [VOTE] AIP-50 Trigger DAG UI Extension with Flexible User Form Concept

2023-01-05 Thread Oliveira, Niko
+1 binding

I really like this one, happy to see it coming along nicely!

Cheers,
Niko



From: Tomasz Urbaszek 
Sent: Friday, December 30, 2022 4:20 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][VOTE] AIP-50 Trigger DAG UI Extension with Flexible 
User Form Concept


+1 binding, I like the idea!

T.

On Sat, 24 Dec 2022 at 15:45, Pierre Jeambrun <pierrejb...@gmail.com> wrote:
+1 binding

On Sat 24 Dec 2022 at 12:09, Xinbin Huang <b...@apache.org> wrote:

+1 binding

On Fri, Dec 23, 2022 at 5:43 PM Kaxil Naik <kaxiln...@gmail.com> wrote:
+1 binding

On Fri, 23 Dec 2022 at 16:09, Constance Martineau 
 wrote:
I somehow missed the discussion for this earlier. I can't comment on 
implementation, but the feature itself is really cool and such a useful 
addition! This is a bit presumptuous since there hasn't been a vote yet, but 
I'm looking forward to seeing this officially part of the project :)

On Fri, Dec 23, 2022 at 6:26 AM Scheffler Jens (XC-DX/ETV5) 
 wrote:
Hi Airflow Developers,

Sorry, I am new to the process. After the discussion we had in previous emails in 
https://lists.apache.org/thread/kxkctcbh9drfw065dgvr673zl0xyfl3r and on 
Confluence in 
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-50+Trigger+DAG+UI+Extension+with+Flexible+User+Form+Concept
I have collected some non-binding (positive) feedback so far. I am a bit late in 
asking the devlist for a VOTE to progress.

In October I went ahead and created some PoC for an implementation in 
https://github.com/apache/airflow/pull/27063.
After investing some time over the last few days, the AIP-50 proposal in the PR 
is now complete, so you can check out the branch and have a preview there. 
Only 821 LoC added for a cool feature. 
(Of course it is still built on FAB, but the implementation can be converted to 
React later, similarly, when AIP-38 progresses.)

Hope proposal is accepted, happy to get feedback.

Mit freundlichen Grüßen / Best regards

Jens Scheffler

Deterministik open Loop (XC-DX/ETV5)
Robert Bosch GmbH | Hessbruehlstraße 21 | 70565 Stuttgart-Vaihingen | GERMANY | www.bosch.com
Tel. +49 711 811-91508 | Mobil +49 160 90417410 | Threema / Threema Work: 
KKTVR3F4 | jens.scheff...@de.bosch.com

Registered Office: Stuttgart, Registration Court: Amtsgericht Stuttgart, HRB 14000;
Chairman of the Supervisory Board: Prof. Dr. Stefan Asenkerschbaumer; Managing Directors: 
Dr. Stefan Hartung,
Dr. Christian Fischer, Filiz Albrecht, Dr. Markus Forschner, Dr. Markus Heyn, 
Rolf Najork


--


Constance Martineau
Product Manager

Email: consta...@astronomer.io
Time zone: US Eastern (EST UTC-5 / EDT UTC-4)





Re: [VOTE] New Provider: Cloudera

2022-12-09 Thread Oliveira, Niko
> This has yet to be published by Google and Amazon - I know they are progressing 
> a lot on making the automation and publishing regular result of the System 
> tests from main in the way that we can verify that all tests pass - all that is 
> done outside of the community resources and maintenance (i.e. this is entirely 
> on the Amazon and Google teams to run and publish those tests).

Just as an update: we're still working hard on this, I promise :) It has taken 
MUCH longer than expected to get all the requisite internal approvals and 
agreement on how to share the results with the community. But we're zeroing in 
on an approach that everyone agrees on for publication. Please bear with us on 
this one!



> In this case - anyone with any "Cloudera" account should be able to run it 
> locally when contributing. But the idea of AIP-47 was to off-load regular 
> execution of those tests and provide public "status" of those to those teams of 
> those service providers that want to make sure that their provider still runs.

Agreed, the tests are written in a way that anyone can run them (with 
mechanisms to provide any pre-existing resources some tests require). But 
expecting the community to have the resources to regularly run all the system 
tests for all providers is unreasonable; collaboration is really required here.

Cheers,
Niko




From: Pierre Jeambrun 
Sent: Friday, December 9, 2022 11:43:21 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][VOTE] New Provider: Cloudera




Thanks for taking time to give more details Jarek. This puts things in 
perspective.

On Fri, Dec 9, 2022 at 18:48, Collin McNulty wrote:
I concur with the concerns raised by Ash. Cloudera seems like an organization 
quite well suited to releasing its own provider. If such an organization is not 
expected to release outside the Apache process, who is? Maybe I'm 
misunderstanding, but I thought that the idea was that providers going forward 
would be mostly third party which allows for a larger and more vibrant 
ecosystem.

Collin McNulty

On Fri, Dec 9, 2022 at 6:04 AM Pierre Jeambrun <pierrejb...@gmail.com> wrote:
Hello,

I am really excited about a public official cloudera provider for airflow. This 
would be a great addition to the airflow ecosystem.

System tests would be an additional layer that would be great for the CI and 
release process, but would individual contributors be able to run these system 
tests locally ? From what I understand, such credentials would be stored in the 
CI, and only people with their own credentials would be able to test the code 
locally and therefore realistically help in maintaining the provider. 
(Iterating on CI failure wouldn't be great :p)

Ash's point resonates with me: I remember having to work on a specific 
provider where free accounts/quotas were not available. It was basically a shot 
in the dark, making code changes based on documentation and API specs without 
being able to actually test the code. Maybe this was the reason the issues 
stayed open for more than a year without being picked up.

Will the community really be able to contribute and support the provider, while 
most of us don't have a paid account ? Or is it 'stakeholders' maintained and 
'community' released at most. (Even reviewing code for release would be tricky 
without an account).

Maybe I misunderstood something and apologize in advance.

Best regards,
Pierre

On Fri, Dec 9, 2022 at 12:24, Jarek Potiuk <ja...@potiuk.com> wrote:
> My concern about how we will actually test it works given we'd need a 
> cloudera account/install/instance would be good to comment on though.

This is a very good point Ash, and I love that you've made it, as I think we have a 
very good solution at hand.

This simply calls for Cloudera's commitment to work on AIP-47 style tests and 
providing a test bed for that.

This has yet to be published by Google and Amazon - I know they are progressing 
a lot on making the automation and publishing regular result of the System 
tests from main in the way that we can verify that all tests pass - all that is 
done outside of the community resources and maintenance (i.e. this is entirely 
on the Amazon and Google teams to run and publish those tests).

So I have a PROPOSAL (I can send a formal vote on that shortly)

For the future (starting with Cloudera) we should make it a requirement 
that any provider accepted by the community MUST have 
AIP-47 style System Tests, and the service provider in question MUST provide 
their own System Test environment with public access to its status for the 
community and commit to maintaining it for as long as the Provider is 
released by the community.

I think this is a very reasonable ask for Cloudera (and anyone else in the 

Re: [PROPOSAL] Dealing with public runner test failues (Integration tests restructuring)

2022-12-07 Thread Oliveira, Niko
Awesome to hear this!

I was really battling this issue last week, very excited for these 
improvements, let me know if I can help.

Cheers,
Niko


From: Jarek Potiuk 
Sent: Tuesday, December 6, 2022 5:54:07 AM
To: dev@airflow.apache.org
Subject: [EXTERNAL] [PROPOSAL] Dealing with public runner test failues 
(Integration tests restructuring)




Hey everyone,

I think many contributors (non-committers) started to suffer from
often failing (disappearing) test runs (mostly for sqlite).

Together with @Taragolis, we looked at those recent stability issues
with "public runners". They all boil down to the integration tests
taking too much memory.

An example screenshot from a debug run that I ran while trying to "catch
the problem in the act" is attached. It seems that just before the
runner "disappeared" we had only 55 MB left (out of the 7 GB available in
the public runners). Looks like the writing is on the wall.

There are two ways we will be addressing it shortly (unless someone
objects or has more/other ideas to improve it):

1. Improving how integration tests are structured and run

* We will reorganize our integration tests to be (similar to system
tests) in a separate subfolder of "tests" - this will allow for
easier discovery and a better structured approach to all integration
tests.

* We will STOP running integration tests in our regular test jobs.
Instead we will introduce a separate "Integration Test" job that will
run only integration tests and that will run the integrations
"one-by-one" - i.e. we will not be starting kerberos, mongo, redis
all together, but will only start the minimal set of integrations needed
for the tests that are using them.

2. Arranging for bigger public runners

I am discussing using more powerful public runners in the Apache
Infrastructure meetings (the next meeting is on Wednesday). This is
possible, and we just need to make sure INFRA/Apache is not overusing
the free runners the Apache Software Foundation gets as a generous
sponsorship from GitHub. This might actually vastly decrease the
feedback time you get as non-committers, as we can get up to 4x
faster builds this way.
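
The "one-by-one" scheme from point 1 can be sketched roughly as below. This is only an illustration of the idea, not the actual CI implementation: the test paths, the mapping, and the function name are all hypothetical. Tests are grouped by the integration they need, so each run starts only that one service:

```python
from collections import defaultdict

# Hypothetical mapping of test modules to the single integration each needs.
TEST_INTEGRATIONS = {
    "tests/integration/test_kerberos_auth.py": "kerberos",
    "tests/integration/test_mongo_hook.py": "mongo",
    "tests/integration/test_redis_sensor.py": "redis",
    "tests/integration/test_redis_hook.py": "redis",
}


def plan_integration_runs(test_integrations):
    """Group tests by integration so each run starts only one service."""
    runs = defaultdict(list)
    for test_path, integration in test_integrations.items():
        runs[integration].append(test_path)
    # Sort for a stable, reproducible order of runs.
    return [(name, sorted(paths)) for name, paths in sorted(runs.items())]
```

Each tuple in the returned plan corresponds to one job step: start the named integration, run its tests, and stop it before moving on, which keeps peak memory bounded by the largest single integration.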

J.


Re: Proposal to Remove Executor Coupling in Core Airlfow Code Base

2022-12-06 Thread Oliveira, Niko
Hey Robert and Utkarsh!

Thanks for your interest!

The tasks which have a smaller dev estimate (the ones marked with S, see the 
README for more explanation) would be better starter tasks since they're 
smaller in scope and should just need one maybe two PRs.


Any task that's in the Backlog column and does not have an assignee yet is up 
for grabs. There are two Small (S) tasks left, so you can each grab one of them; 
just leave a comment on the task you're interested in and I'll assign it to you.

I'm on Slack as well, please follow up there if you need anything; I'm happy to 
help and coach through the changes!

Cheers,
Niko


From: Robert Karish 
Sent: Monday, December 5, 2022 7:56 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL]Proposal to Remove Executor Coupling in Core Airlfow 
Code Base


Hey Niko,

I am also a new contributor to Airflow, I’ve only made a few small 
contributions so far, and I am more of a user (for work) than a contributor. I 
would like to get involved in this AIP as well because I think this is a great 
improvement to the project. I’ve looked over the Github Projects page that 
you’ve made, but I’m not sure yet which available task would be best for me. I 
will look into these more closely over the next few days. I am active on our 
Slack channel so if you have any suggestions on a good starter task for this 
AIP that I could tackle in my spare time let me know on there.

Best,
Robert

On Mon, Dec 5, 2022 at 10:21 PM Utkarsh Sharma 
 wrote:
Hi Niko,

I'm new to airflow's codebase, but would very much like to work with you on 
this AIP. Please let me know how I can help you.

Thanks,
Utkarsh Sharma


On Tue, Dec 6, 2022 at 1:58 AM Jarek Potiuk <ja...@potiuk.com> wrote:
Cool!

On Mon, Dec 5, 2022 at 9:16 PM Oliveira, Niko  
wrote:

Hey folks!

As a follow-up, if you're interested in following along with this project or 
even taking some tasks, I've created a dashboard using Github's new Projects 
tool. You can see the backlog of tasks, who's assigned, the estimated size and 
priority of the task and its current state: 
https://github.com/orgs/apache/projects/162/views/7?layout=board

NOTE: See the README (button in the top right corner) for a more detailed 
explanation of the priority and estimate fields (it's frustrating how hidden 
the README is for this Projects tool...)


This will be a live board, new tasks will be added as they come up, but the 
general skeleton is there.


Cheers,
Niko



From: Oliveira, Niko 
Sent: Monday, November 7, 2022 4:41:44 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL]Proposal to Remove Executor Coupling in Core Airlfow 
Code Base





Hey folks!

I went ahead and wrote an AIP for this proposal. It can be found here:
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-51+Removing+Executor+Coupling+from+Core+Airlfow

Please leave any feedback here or in the Confluence comments.

Thanks for your time!

From: Oliveira, Niko 
Sent: Monday, October 24, 2022 2:28 PM
To: dev@airflow.apache.org
Subject: [EXTERNAL] Proposal to Remove Executor Coupling in Core Airlfow Code 
Base



Hey all!

Recently I have spent some time investigating the occurrences of hardcoded 
Executor logic within core Airflow code and put together a mini-AIP of sorts on 
Github Discussions (it was nice to use GH markdown and automatic code snippets).

I'm particularly interested to hear if folks think an AIP would be reasonable 
for this set of changes or if the community is fine with using Discussions 
alone and beginning development without an AIP.

https://github.com/apache/airflow/discussions/27241

Thanks for your time!



Re: Proposal to Remove Executor Coupling in Core Airlfow Code Base

2022-12-05 Thread Oliveira, Niko
Hey folks!

As a follow-up, if you're interested in following along with this project or 
even taking some tasks, I've created a dashboard using Github's new Projects 
tool. You can see the backlog of tasks, who's assigned, the estimated size and 
priority of the task and its current state: 
https://github.com/orgs/apache/projects/162/views/7?layout=board

NOTE: See the README (button in the top right corner) for a more detailed 
explanation of the priority and estimate fields (it's frustrating how hidden 
the README is for this Projects tool...)


This will be a live board, new tasks will be added as they come up, but the 
general skeleton is there.


Cheers,
Niko



From: Oliveira, Niko 
Sent: Monday, November 7, 2022 4:41:44 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL]Proposal to Remove Executor Coupling in Core Airlfow 
Code Base





Hey folks!

I went ahead and wrote an AIP for this proposal. It can be found here:
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-51+Removing+Executor+Coupling+from+Core+Airlfow

Please leave any feedback here or in the Confluence comments.

Thanks for your time!


From: Oliveira, Niko 
Sent: Monday, October 24, 2022 2:28 PM
To: dev@airflow.apache.org
Subject: [EXTERNAL] Proposal to Remove Executor Coupling in Core Airlfow Code 
Base



Hey all!

Recently I have spent some time investigating the occurrences of hardcoded 
Executor logic within core Airflow code and put together a mini-AIP of sorts on 
Github Discussions (it was nice to use GH markdown and automatic code snippets).

I'm particularly interested to hear if folks think an AIP would be reasonable 
for this set of changes or if the community is fine with using Discussions 
alone and beginning development without an AIP.

https://github.com/apache/airflow/discussions/27241

Thanks for your time!



Re: [DISCUSSION] Assessing what is a breaking change for Airflow (SemVer context)

2022-12-05 Thread Oliveira, Niko
> 1) users "peace of mind" as top priority: clarity of what they can
> expect from Airflow, and avoiding surprises when upgrading
> 2) targeting minimal disruption to user's workflows (though we might
> never reach absolute 100%)
> 3) making it easy for contributors and maintainers to decide on
> breaking/non-breaking behaviours

Yupp, I agree, this is an accurate encapsulation of the issues at hand.


> My proposal is to work on documenting our approach for our users (and
> for maintainers) in a single page: "What is Airflow Public API?" and
> what users can expect.

I think this is actually a very important piece we've been missing. The 
SemVer specification itself says:


"For this system to work, you first need to declare a public API. This may 
consist of documentation or be enforced by the code itself. Regardless, it is 
important that this API be clear and precise. Once you identify your public 
API, you communicate changes to it with specific increments to your version 
number."
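
As a rough illustration of what "specific increments" means under the standard SemVer rules (MAJOR for incompatible API changes, MINOR for backwards-compatible functionality, PATCH for backwards-compatible fixes), here is a small Python sketch. The function name and the change labels are made up for this example and are not part of any Airflow tooling:

```python
def bump(version, change):
    """Return the next version for a 'breaking', 'feature', or 'fix' change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "breaking":
        # Incompatible public API change: bump MAJOR, reset the rest.
        return f"{major + 1}.0.0"
    if change == "feature":
        # Backwards-compatible functionality: bump MINOR, reset PATCH.
        return f"{major}.{minor + 1}.0"
    if change == "fix":
        # Backwards-compatible bug fix: bump PATCH only.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change}")
```

The sketch also shows why the public API has to be declared first: deciding which of the three labels applies is only possible once it is clear what counts as public.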

So as difficult as I think it will be to accurately describe and automate what 
the Airflow public API is, I think it's a very useful project to undertake. 
Perhaps even codifying it in an AIP.
At the moment we consider even the deepest/smallest "private" helper function 
within util provider code to be public. This level of public API makes 
iterating and maintaining the code very laborious. So I definitely think this 
is worth the effort.
I'll need to have a closer look at that PR, but the exact technical details can 
certainly be hammered out later.

Cheers,
Niko


From: Jarek Potiuk 
Sent: Saturday, December 3, 2022 1:25 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][DISCUSSION] Assessing what is a breaking change for 
Airflow (SemVer context)




Sorry for not following up on this for a bit - it's been hectic these
days for me. I think valid points were made, and from the tone of
those I feel that all of us who participated have the same sense of what
is important:

1) users "peace of mind" as top priority: clarity of what they can
expect from Airflow, and avoiding surprises when upgrading
2) targeting minimal disruption to user's workflows (though we might
never reach absolute 100%)
3) making it easy for contributors and maintainers to decide on
breaking/non-breaking behaviours

I think there is a main blocker to all of those (also mentioned in the
discussion above):

We are extremely cautious about any change because there is a lack of
agreement/expectations with our users on what is supposed to be the
"public API" .

# Proposal

My proposal is to work on documenting our approach for our users (and
for maintainers) in a single page: "What is Airflow Public API?" and
what users can expect.

There are certain areas where we can define rules and either automate
or document (or both) our statement about what is the "public" API and
(more importantly) what is clearly NOT, in a single-page document.
It should also be accompanied (where possible) by some
automation and tooling that would help us to express it in detail (and
help our users to validate if they are conforming to the "public
API").

We won't solve it very quickly, but once we start doing it, it might
turn out that it's not that long of a process in fact. And if we start
it now - in a few months we might be in a different place.

# Some concrete actions we might take

1) On the "Code" level - we can start to define the API that is
considered as "public" and add verification of those for our users. We
could implement a similar solution to what I proposed to common.sql
https://github.com/apache/airflow/pull/27962 (where I followed Ash's
idea to use MyPy stubgen and pre-commits to flag changes to it, and
where we harness MyPy capabilities to control how the API is used). I
believe that we could apply a similar solution to all providers and
eventually even all parts of core, to make it very clear which part of
the Airflow API is public and which is not. I think MyPy and
strong-ish typing are taking the Python world by storm, and we could
use it as a standard way of communicating to those who use Airflow as
a library, which parts are "public".

Having .pyi files as part of our packages with "hidden" parts that are
not supposed to be exposed seems to be not only a nice communication
tool but also has support for all kinds of tooling from day 0 for
our users (IDE integrations, automations to check if the right API is
used etc.). We could even easily provide guidelines for the users
"Here is how you can check if you are using Airflow code properly".
Not 100% foolproof but much better than anything else I can imagine.
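
To make the idea of flagging public API changes concrete, here is a minimal sketch. Note that it does not use MyPy stubgen or pre-commit as proposed above; it swaps in Python's standard `ast` module purely for illustration, and the function names and sample sources are hypothetical. It extracts public top-level function signatures from two versions of a module and reports the differences:

```python
import ast


def public_signatures(source):
    """Map each public top-level function name to its argument names."""
    tree = ast.parse(source)
    signatures = {}
    for node in tree.body:
        # Names starting with "_" are treated as private and ignored.
        if isinstance(node, ast.FunctionDef) and not node.name.startswith("_"):
            signatures[node.name] = [arg.arg for arg in node.args.args]
    return signatures


def api_changes(old_source, new_source):
    """List human-readable differences between two versions' public APIs."""
    old, new = public_signatures(old_source), public_signatures(new_source)
    changes = []
    for name in sorted(old.keys() - new.keys()):
        changes.append(f"removed: {name}")
    for name in sorted(new.keys() - old.keys()):
        changes.append(f"added: {name}")
    for name in sorted(old.keys() & new.keys()):
        if old[name] != new[name]:
            changes.append(f"signature changed: {name}")
    return changes


# Hypothetical before/after versions of a small module.
OLD = "def run(sql):\n    pass\n\ndef _helper():\n    pass\n"
NEW = "def run(sql, parameters):\n    pass\n\ndef _renamed_helper():\n    pass\n"
```

With these samples, renaming the private helper is not reported, while the changed signature of `run` is. A stub-based check would work on the same principle, with the committed .pyi files playing the role of the recorded snapshot.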

Also having it in place will allow the providers to be finally
separated to separate repositories - and we could use MyPy checks
rather than running the full test 

Re: [ANNOUNCE] New committer Andrey Anshin (Taragolis)

2022-12-02 Thread Oliveira, Niko
Congrats Andrey!


From: Jarek Potiuk 
Sent: Thursday, December 1, 2022 11:29:14 PM
To: dev@airflow.apache.org
Subject: [EXTERNAL] [ANNOUNCE] New committer Andrey Anshin (Taragolis)




Hello Airflow Community,

I am happy to announce that the Project Management Committee (PMC) for
Apache Airflow
has invited Andrey Anshin (github nickname Taragolis) to become a
committer and we are pleased
to announce that they have accepted.

Congratulations Andrey, and welcome!

Jarek on behalf of Airflow PMC


Re: [DISCUSSION] Assessing what is a breaking change for Airflow (SemVer context)

2022-11-22 Thread Oliveira, Niko
Thanks for starting this discussion Jarek!

I think it's very important for us to get on the same page as a community about 
this.

I'd love to go with a more flexible/common sense approach for considering 
breaking changes, and in a perfect world I think this would be best. However, I 
also think it will be very hard to document a policy that is clear enough to 
make the decision making for each case straightforward (we're also missing data 
to make those customer impact assessments too).

Suggestion 1 from Bolke is actually quite an appealing approach to me. The 
advantage I love is that it allows us to remain quite strict on SemVer, which is 
a huge benefit: being very strict means that the decision-making is easy 
and keeps the overhead low (it gets you out of the "discussions, discussions, 
discussions" loop). Again, it will otherwise be harder to document a 
common-sense based policy that covers all the edge cases clearly enough (like 
the ones which have already been brought up in this thread).


Cheers,
Niko



From: Bolke de Bruin 
Sent: Tuesday, November 22, 2022 5:51:32 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][DISCUSSION] Assessing what is a breaking change for 
Airflow (SemVer context)




I beg to differ too and I do think that Alexander is right in what he wants to 
accomplish. Large installations do want to do rolling upgrades and not bring a 
cluster down for an upgrade. It should be possible to run Airflow Core 2.4 with 
for example 2.3 workers. It should however not happen through reliance on the 
DB schema but through a (non-public) API specification. This requires 
significant architectural improvements. It means decoupling workers from the 
DB, but also Tasks. It requires backoff periods and strict versioning on the 
API. It would really improve security (availability, integrity). It is 
partially addressed in one of the AIPs but imho not sufficiently.

Now Airflow is just a monolith packaged in PODs or microservices.

Something for the future (hopefully :-) ).

B.





On 22 November 2022 at 12:40:07, Abhishek Bhakat 
(abhishek.bha...@astronomer.io.invalid)
 wrote:

Hi,

I beg to differ with Alexander and agree with Jarek. There are multiple ways to 
deploy Airflow. The most commonly used is Docker images, in which case using one 
image for all components is standard practice. If using native pip 
installations, Airflow components are launched by a single pip module. So, 
having different versions of components (as you mentioned) means doing extra work 
just to keep them out of sync. Basic common sense would be not to take extra 
steps to self-sabotage.

Thanks,
Abhishek

On 22-Nov-2022 at 4:35:09 PM, Alexander Shorin <kxe...@gmail.com> wrote:
On Tue, Nov 22, 2022 at 1:37 PM Jarek Potiuk <ja...@potiuk.com> wrote:
BTW. "Workers from 2.2" used with "Airflow 2.4" is not even a thing.
This is something that you should never, ever, try to do.
This is even more common sense, and there are of course limits of what
you can describe in the docs (whatever you come up with, someone might
have a super crazy idea that you have not thought about and - for
example - run an Airflow 1.10 worker with Airflow 2 (why not? We have not
written it should not happen).

At scale, you cannot upgrade all the versions and keep them in sync all the 
time. For minor versions compatibility is expected. Obviously, it isn't for a 
major one. It is common sense and practice in the real world, sorry.

--
,,,^..^,,,




[RESULT][VOTE] AIP-51 Removing Executor Coupling from Core Airlfow

2022-11-21 Thread Oliveira, Niko
Hey folks!

The voting for AIP-51 Removing Executor Coupling from Core Airlfow  
(https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-51+Removing+Executor+Coupling+from+Core+Airlfow)
 was completed on November 21, 2022, and I am happy to announce the following 
voting result:

*Binding (+6) Votes*
Jarek Potiuk
Elad Kalif
Jeambrun Pierre
Tomasz Urbaszek
Ping Zhang
Xiaodong Deng


*Non-binding (+5) Votes*
Darren Weber
Igor Kholopov
Mocheng Guo
Robert Karish
Pankaj Singh

I would like to thank all the above who participated in this voting!
Link to the vote thread: 
https://lists.apache.org/thread/p0jpt7b6qrmfhboqxhq3718xxh1z0ow


[VOTE] AIP-51 - Removing Executor Coupling from Core Airlfow

2022-11-15 Thread Oliveira, Niko
Hey folks!

I would like to start a vote for "AIP-51 - Removing Executor Coupling from Core 
Airlfow".

You can find the AIP here:
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-51+Removing+Executor+Coupling+from+Core+Airlfow

Discussion threads:
https://lists.apache.org/thread/yd4tcgyqft664dt8clgx48gyvfzddnjh
https://github.com/apache/airflow/discussions/27241

The voting will last for 6 days (until 21st of November 2022, 2:00pm PST), and 
until at least 3 binding votes have been cast.

Please vote accordingly:

[ ] + 1 approve
[ ] + 0 no opinion
[ ] - 1 disapprove with the reason

Only votes from PMC members and committers are binding, but other members of 
the community are encouraged to check the AIP and vote with "(non-binding)".

Thanks!




Re: Proposal to Remove Executor Coupling in Core Airlfow Code Base

2022-11-08 Thread Oliveira, Niko
Thanks for the review Jarek!

> My only ask will be to split it into smaller, independent PRs when it
> gets implemented, so that we can assess better the consequences of some
> changes

Yupp, totally agree. I plan to create a tracking Issue that breaks down the AIP 
into chunks (linked child Issues). Probably one issue/PR per coupling type 
mentioned in the doc.

Cheers,
Niko


From: Jarek Potiuk 
Sent: Tuesday, November 8, 2022 3:52:15 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL]Proposal to Remove Executor Coupling in Core Airlfow 
Code Base




I really like it, and for me it's no brainer - we've been casually
discussing that we need to do that one sooner rather than later and I
like the attention to detail and the analysis done by Niko.

My only ask will be to split it into smaller, independent PRs when it
gets implemented, so that we can assess better the consequences of some
changes (and whether we have to potentially account for backwards
compatibility for users who implemented their own executors - not a
very popular option, partly because of the problems this AIP is aiming to
solve, but still we have to account for that).

I personally think if there are no objections, this one is ready to
start voting on.

J.

On Tue, Nov 8, 2022 at 1:42 AM Oliveira, Niko
 wrote:
>
> Hey folks!
>
> I went ahead and wrote an AIP for this proposal. It can be found here:
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-51+Removing+Executor+Coupling+from+Core+Airlfow
>
>
> Please leave any feedback here or in the Confluence comments.
>
> Thanks for your time!
>
> ____
> From: Oliveira, Niko 
> Sent: Monday, October 24, 2022 2:28 PM
> To: dev@airflow.apache.org
> Subject: [EXTERNAL] Proposal to Remove Executor Coupling in Core Airlfow Code 
> Base
>
>
> Hey all!
>
> Recently I have spent some time investigating the occurrences of hardcoded 
> Executor logic within core Airflow code and put together a mini-AIP of sorts 
> on Github Discussions (it was nice to use GH markdown and automatic code 
> snippets).
>
> I'm particularly interested to hear if folks think an AIP would be reasonable 
> for this set of changes or if the community is fine with using Discussions 
> alone and beginning development without an AIP.
>
> https://github.com/apache/airflow/discussions/27241
>
> Thanks for your time!
>


Re: Proposal to Remove Executor Coupling in Core Airlfow Code Base

2022-11-07 Thread Oliveira, Niko
Hey folks!

I went ahead and wrote an AIP for this proposal. It can be found here:
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-51+Removing+Executor+Coupling+from+Core+Airlfow

Please leave any feedback here or in the Confluence comments.

Thanks for your time!


From: Oliveira, Niko 
Sent: Monday, October 24, 2022 2:28 PM
To: dev@airflow.apache.org
Subject: [EXTERNAL] Proposal to Remove Executor Coupling in Core Airlfow Code 
Base



Hey all!

Recently I have spent some time investigating the occurrences of hardcoded 
Executor logic within core Airflow code and put together a mini-AIP of sorts on 
Github Discussions (it was nice to use GH markdown and automatic code snippets).

I'm particularly interested to hear if folks think an AIP would be reasonable 
for this set of changes or if the community is fine with using Discussions 
alone and beginning development without an AIP.

https://github.com/apache/airflow/discussions/27241

Thanks for your time!



Re: [PROPOSAL] Clarifications of triage team role including strenghtening importance of active triaging

2022-10-31 Thread Oliveira, Niko
in the previous description we jumped straight to "labels" but
I believe (and that might be a good point to discuss with the current
triagers and committers) that labeling of priorities/etc. should be
somewhat "last" point of the triaging. We rarely (if at all) look at
those labels and IMHO there are many things triager can do long before
assigning the labels - closing an issue, @-mentioning others,
assigning "good first issue" label,  asking for extra information (and
assigning "pending-response" label), converting to a discussion,
assigning next patchlevel release as a milestone - those are all the
actions that are very well embedded in our process and they will
guarantee an almost-automated follow-up and that the issue will be
eventually "acted upon". Those actions I mentioned are taken from
actual actions we've been taking in practice.

These are exactly the types of things I've found myself doing while triaging 
and I agree they are massively more helpful than just adding labels alone. And 
speaking of which, to your question, IMHO the labels are best used for issue 
discovery and sorting. Folks can use them to find open issues they'd like to 
work on in a particular area, which I think is helpful, especially around 
providers. I find myself doing this, but I have no data to prove others are 
also doing it. Overall, I don't feel too strongly, if the majority opinion is 
that they're just adding noise, then I'd be happy to vote to get rid of them.

Cheers,
Niko


From: Jarek Potiuk 
Sent: Saturday, October 29, 2022 5:26:13 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][PROPOSAL] Clarifications of triage team role including 
strenghtening importance of active triaging




Thanks to everyone who added their comments in
https://github.com/apache/airflow/pull/27262 - a lot of the comments
were super valuable and helpful and I re-worked the description a bit
based on them so you might want to take another look.

I've modified and extended the description a bit to clarify a few things:

1) I've added a clearer role for GitHub Discussions - and added some
clearer criteria on where I believe issues should be converted to
discussions (I think this is a very important part of the issue
triaging process that basically very well filters out the real issues
from just discussions and it can make our issue triaging process much
more efficient).

2) I also updated a "feature" issue template to clarify that the
potential scope of the feature should be "small" and that you need to
start a GitHub Discussion and likely create an AIP if you want to
propose a "big" feature. It might be loosely related, but I noticed we
do not even tell the users what "feature" issue scope should be. I
think it makes it easy for triagers to convert such a "feature"
request into a discussion if the user had a chance to read that they
should do that if unsure if the scope of the proposal is small enough
for "feature". Yes - they might not read it, but if we explain that in
the template, this is far less of a contention if we convert such an
issue in the discussion because the user was actually instructed to do
so themselves if unsure.

3) I added a much clearer description of the actions that triagers can
take - in the previous description we jumped straight to "labels" but
I believe (and that might be a good point to discuss with the current
triagers and committers) that labeling of priorities/etc. should be
somewhat "last" point of the triaging. We rarely (if at all) look at
those labels and IMHO there are many things triager can do long before
assigning the labels - closing an issue, @-mentioning others,
assigning "good first issue" label,  asking for extra information (and
assigning "pending-response" label), converting to a discussion,
assigning next patchlevel release as a milestone - those are all the
actions that are very well embedded in our process and they will
guarantee an almost-automated follow-up and that the issue will be
eventually "acted upon". Those actions I mentioned are taken from
actual actions we've been taking in practice.  I (and others) have
been successfully doing this for years and this already helped a lot
in our issue handling process. What I put there merely explicitly
lists the actions we've been taking - it's nothing new or unexpected.
And putting them down is actually a good reference and if users are
asking why we are doing it, we can point them to that description.

4) **QUESTION**: I have however one area that I am not so sure about
and wanted to ask everyone for their opinion:

I see a great value in assigning "good first issue" and "pending
response" labels. However, assigning the "kind", "priority", "area"
and "providers", "affected_version" labels on issues is very
different. Those are more "statistics" 

Re: [PROPOSAL] Clarifications of triage team role including strenghtening importance of active triaging

2022-10-25 Thread Oliveira, Niko
Thanks Jarek for the extra context and additions to the Triaging rst!



An update from my end as a new Triage member:


For the past few weeks I have looked at every single notification from Airflow 
Github and it has been a very informative learning experience.


1. Beyond tagging and updating Issues you gain exposure to far more user 
concerns, questions and PRs than you otherwise would. This has been massively 
helpful to be more aware of all the work going on in Airflow and where gaps are 
for our users. It rounds you out as a community member in Airflow.


2. You are also given the opportunity to shepherd in more contributors to the 
project. It feels particularly good to help someone who filed an issue to get 
involved in the community by submitting a PR to solve their issue, and then 
following through with reviewing it, and so on. Seeing that process through 
from beginning to end is very rewarding!


3. I am also getting a true sense of just how overwhelming the influx of 
Issues, PRs and Discussions is. I have come across several folks who submitted 
PRs and never got feedback and then left the community. Losing these folks is a 
bad experience for them but also for us because we lost perhaps a great future 
contributor/committer/triager. We certainly need all the help we can get on 
this front, for reviewing, providing feedback and ultimately merging folks' 
contributions.


I truly appreciate all the time the committers and other community members put 
into this side of Airflow; it's tedious and time-consuming work, and is often 
underappreciated!


Cheers,
Niko



From: Jarek Potiuk 
Sent: Tuesday, October 25, 2022 6:50 AM
To: dev@airflow.apache.org
Subject: [EXTERNAL] [PROPOSAL] Clarifications of triage team role including 
strenghtening importance of active triaging


Hey everyone,

I think with the recent addition of Denis and Niko to the triage team we 
already see some good signs that we should likely make the triage team stronger 
and also incentivised to help :).

I think this is instrumental to be more welcoming for our community to respond 
quicker to issues and discussions (even if that response is "no we won't do it 
because" or "converting to discussion because")  and I think having a committed 
(pun intended) triage team that helps committers in this task is crucial.

We already mentioned "issue triaging" in our criteria to become a 
committer, but I think it was not too clear 
https://github.com/apache/airflow/blob/main/COMMITTERS.rst#community-involvement.
 You could interpret it differently I think (building issue triaging process? 
Making sure it works? Doing triaging yourself? What are the expectations when 
you are part of the triage team?).

I attempted to describe it a bit better and explain what are the expectations 
from the triage team and why actively helping to triage issues (and 
discussions!) is actually a good way to become a committer. Clarifying it might 
also help some other people to step up and ask to be added (or even us - 
maintainers - to invite some of the contributors to the triage team) - because 
they will realise this is a really good way on their road to committership.

I based it quite a bit on my own actions and experiences - I think actively 
helping our users by responding to their issues (which I think should also 
become more explicit part of expectations for the triage team) is one of the 
fastest and best ways for me to learn the parts of code I am not actively 
contributing to (on top of actually helping our users)

I prepared a draft proposal PR that attempts to explain our approach, 
expectations and strengthens a bit the importance of active triaging in the 
road to committership:

https://github.com/apache/airflow/pull/27262

I'd love comments on that one - either here or in the PR. Would love to hear 
what others think.

J.



Proposal to Remove Executor Coupling in Core Airlfow Code Base

2022-10-24 Thread Oliveira, Niko
Hey all!

Recently I have spent some time investigating the occurrences of hardcoded 
Executor logic within core Airflow code and put together a mini-AIP of sorts on 
Github Discussions (it was nice to use GH markdown and automatic code snippets).

I'm particularly interested to hear if folks think an AIP would be reasonable 
for this set of changes or if the community is fine with using Discussions 
alone and beginning development without an AIP.

https://github.com/apache/airflow/discussions/27241

Thanks for your time!



Re: Github Issue Triaging

2022-10-05 Thread Oliveira, Niko
> @Niko - You can start helping even before you get added to that, by just 
> tagging Elad, myself or other Triage team members (until you get into triage 
> team) on the issues you feel are not-active so that we can close it.

Yupp, sounds good, will do! I appreciate the support Kaxil.

Cheers,
Niko



From: Kaxil Naik 
Sent: Wednesday, October 5, 2022 10:16 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL]Github Issue Triaging




Once we have the list and if there are no objections from PMCs I'll open a Jira 
to Infra to grant access for the triage role.
No objections from me, the more the merrier.

@Niko - You can start helping even before you get added to that, by just 
tagging Elad, myself or other Triage team members (until you get into triage 
team) on the issues you feel are not-active so that we can close it.

Other things are:

  *   Reminding assignees if they are still working on the issues
  *   Or add "pending response" where the issue creator hasn't replied

Regards,
Kaxil

On Wed, 5 Oct 2022 at 17:32, Oliveira, Niko  wrote:

Hey Elad!

Yupp, I read through that rst and I'm going to help triage incoming issues more 
frequently. Though I do think reviewing the backlog of issues every quarter or 
so can be a useful exercise as well.

Consider this my request to be added to the airflow-triage Team!

Cheers,
Niko


From: Elad Kalif <elad...@apache.org>
Sent: Wednesday, October 5, 2022 9:01:22 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL]Github Issue Triaging




The backlog is actually in good shape. Most of the open issues just need 
attention from a knowledgeable contributor in the specific area to add some 
pointers if the issue is valid or not.

The protocol for triage can be found in: 
https://github.com/apache/airflow/blob/main/ISSUE_TRIAGE_PROCESS.rst
We also have #issue-triage slack channel where we raise 
concerns/questions/focus areas.

Contributors can get triage access if we add them to 
https://github.com/orgs/apache/teams/airflow-triage team (It gives Github 
Triage role access https://infra.apache.org/github-roles.html )
I'd be happy to work with 2-3 contributors on this. If others are interested 
please let us know.
Once we have the list and if there are no objections from PMCs I'll open a Jira 
to Infra to grant access for the triage role.

Noting - regardless of triage privileges I encourage everyone to assist us by 
simply commenting in issues. The act of close/set labels is not time consuming, 
the real problem is actually handling the issues.



On Wed, Oct 5, 2022 at 6:15 PM Oliveira, Niko  
wrote:

Hello folks,

Yesterday I attended a session at ApacheCon about best practices for managing 
bug/issue backlogs for a project and it got me reflecting on the Airflow issue 
backlog. I'd like to get more involved in initial (and continuous) triage of 
Airflow issues on Github.

I chatted with Jarek after the session (yay, in-person events are back!) and he 
mentioned that there is a mechanism available to give non-committers the 
ability to modify/update Issue tags, assignees, etc. on GitHub while triaging 
(though not the ability to merge, of course).

If something like this exists, is anyone willing to add me to it? If it doesn't 
exist, is anyone willing to collaborate with me to set something like this 
up? :)

Cheers,
Niko


Re: Github Issue Triaging

2022-10-05 Thread Oliveira, Niko
Hey Elad!

Yupp, I read through that rst and I'm going to help triage incoming issues more 
frequently. Though I do think reviewing the backlog of issues every quarter or 
so can be a useful exercise as well.

Consider this my request to be added to the airflow-triage Team!

Cheers,
Niko


From: Elad Kalif 
Sent: Wednesday, October 5, 2022 9:01:22 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL]Github Issue Triaging




The backlog is actually in good shape. Most of the open issues just need 
attention from a knowledgeable contributor in the specific area to add some 
pointers if the issue is valid or not.

The protocol for triage can be found in: 
https://github.com/apache/airflow/blob/main/ISSUE_TRIAGE_PROCESS.rst
We also have #issue-triage slack channel where we raise 
concerns/questions/focus areas.

Contributors can get triage access if we add them to 
https://github.com/orgs/apache/teams/airflow-triage team (It gives Github 
Triage role access https://infra.apache.org/github-roles.html )
I'd be happy to work with 2-3 contributors on this. If others are interested 
please let us know.
Once we have the list and if there are no objections from PMCs I'll open a Jira 
to Infra to grant access for the triage role.

Noting - regardless of triage privileges I encourage everyone to assist us by 
simply commenting in issues. The act of close/set labels is not time consuming, 
the real problem is actually handling the issues.



On Wed, Oct 5, 2022 at 6:15 PM Oliveira, Niko  
wrote:

Hello folks,

Yesterday I attended a session at ApacheCon about best practices for managing 
bug/issue backlogs for a project and it got me reflecting on the Airflow issue 
backlog. I'd like to get more involved in initial (and continuous) triage of 
Airflow issues on Github.

I chatted with Jarek after the session (yay, in-person events are back!) and he 
mentioned that there is a mechanism available to give non-committers the 
ability to modify/update Issue tags, assignees, etc. on GitHub while triaging 
(though not the ability to merge, of course).

If something like this exists, is anyone willing to add me to it? If it doesn't 
exist, is anyone willing to collaborate with me to set something like this 
up? :)

Cheers,
Niko


Github Issue Triaging

2022-10-05 Thread Oliveira, Niko
Hello folks,

Yesterday I attended a session at ApacheCon about best practices for managing 
bug/issue backlogs for a project and it got me reflecting on the Airflow issue 
backlog. I'd like to get more involved in initial (and continuous) triage of 
Airflow issues on Github.

I chatted with Jarek after the session (yay, in-person events are back!) and he 
mentioned that there is a mechanism available to give non-committers the 
ability to modify/update Issue tags, assignees, etc. on GitHub while triaging 
(though not the ability to merge, of course).

If something like this exists, is anyone willing to add me to it? If it doesn't 
exist, is anyone willing to collaborate with me to set something like this 
up? :)

Cheers,
Niko


Re: Vending AWS System Test Results Back to the Community

2022-08-19 Thread Oliveira, Niko
 While the 
CloudFormation scripts would be best if published, I think it will be far more 
efficient to get it in the hands of those stakeholders who are mostly 
interested in getting the "green" tests. That is much more scalable solution 
from the community point of view. We are not going to publish it to our users 
and is not really needed to be run on our infra. I don't see a particular need 
for regular community members to even know how/what infrastructure is used to 
run the tests - the test execution is pretty standardised, and I think we are 
really interested in output rather than the infra to run it.

J.




On Fri, Aug 19, 2022 at 2:53 PM Kamil Breguła <dzaku...@gmail.com> wrote:
I don't think we have to limit ourselves so that only the committers have access to 
the Amazon account managed by the Airflow community. In the past, committers were 
supported by other people whom they trust, e.g. a committer asked for help from 
a co-worker from their company when they needed it.

This means that there are no restrictions on Amazon employees using this 
account and maintaining this environment.

We just have to be careful that non-committers do not have write permission to the 
repository, and that they cannot publish a new version of the application that 
can be seen as officially released by the Apache Foundation.

On Fri, Aug 19, 2022, 01:30 Oliveira, Niko  wrote:

Hey folks,


Those of us on the AWS Airflow team (myself, Dennis F, Vincent B, Seyed H) have 
been working on a few projects over the past few months:


1. Writing example dags/docs for all existing Operators in the AWS Airflow 
provider package (done)

2. Writing AWS specific logic in Airflow codebase to support AIP-47 (done)

3. Converting all example dags to AIP-47 compliant system tests (just over 
halfway done)


All of these are ultimately culminating in the goal of us running these system 
tests at a regular cadence within Amazon (where we have access to funded AWS 
accounts). We will run these system tests, triggered by updates to 
airflow:main, at least once a day.

I'd like to open a discussion on how we can vend these results back to the 
community in a way that is most consumable for contributors, release managers 
and users alike.

A quick and easy approach would be to create a publicly viewable CloudWatch 
Dashboard, with at least the following metrics for each system test over time: 
pass/fail, duration, and execution count.
This would be a human-readable way to consume the current status of AWS 
Operators.


If a more machine readable format is required/preferred (e.g. for scripts 
related to Airflow release management perhaps) we could also put together a 
simple API Gateway endpoint that would vend the data in a format such as JSON.

Another interesting option would be for us to publish the CloudFormation 
templates (or the codebase used to generate the templates) for configuring the 
system test environment and executing the tests. This could be deployed to an 
AWS account owned and managed by the Airflow community where tests would be run 
periodically. AWS has provided some credits in the past which could be used to 
help fund the account. But this introduces a large component that would need 
ownership and management by folks within the Airflow community who have access 
to such AWS accounts and credits (likely only committers/release managers?). So 
it might not be worth the complexity.


I'd like to hear what folks think!

Cheers,
Niko





Vending AWS System Test Results Back to the Community

2022-08-18 Thread Oliveira, Niko
Hey folks,


Those of us on the AWS Airflow team (myself, Dennis F, Vincent B, Seyed H) have 
been working on a few projects over the past few months:


1. Writing example dags/docs for all existing Operators in the AWS Airflow 
provider package (done)

2. Writing AWS specific logic in Airflow codebase to support AIP-47 (done)

3. Converting all example dags to AIP-47 compliant system tests (just over 
halfway done)


All of these are ultimately culminating in the goal of us running these system 
tests at a regular cadence within Amazon (where we have access to funded AWS 
accounts). We will run these system tests, triggered by updates to 
airflow:main, at least once a day.

I'd like to open a discussion on how we can vend these results back to the 
community in a way that is most consumable for contributors, release managers 
and users alike.

A quick and easy approach would be to create a publicly viewable CloudWatch 
Dashboard, with at least the following metrics for each system test over time: 
pass/fail, duration, and execution count.
This would be a human-readable way to consume the current status of AWS 
Operators.


If a more machine readable format is required/preferred (e.g. for scripts 
related to Airflow release management perhaps) we could also put together a 
simple API Gateway endpoint that would vend the data in a format such as JSON.
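
As a rough illustration of the machine-readable option, the following is a minimal sketch of how per-test results could be aggregated into JSON carrying the same three metrics (pass/fail, duration, execution count). The `SystemTestRun` record, the field names and the `summarize`/`to_json` helpers are assumptions made up for this sketch, not an existing format or API.

```python
import json
from dataclasses import dataclass


@dataclass
class SystemTestRun:
    # Hypothetical record of a single system test execution.
    test_name: str
    passed: bool
    duration_seconds: float


def summarize(runs):
    """Aggregate runs into per-test pass/fail counts, duration and execution count."""
    summary = {}
    for run in runs:
        entry = summary.setdefault(
            run.test_name,
            {
                "execution_count": 0,
                "pass_count": 0,
                "fail_count": 0,
                "total_duration_seconds": 0.0,
            },
        )
        entry["execution_count"] += 1
        entry["pass_count" if run.passed else "fail_count"] += 1
        entry["total_duration_seconds"] += run.duration_seconds
    return summary


def to_json(runs):
    # Sorted keys keep the output stable for scripts that diff or cache it.
    return json.dumps(summarize(runs), indent=2, sort_keys=True)


if __name__ == "__main__":
    # The test names below are placeholders, not real system test identifiers.
    example_runs = [
        SystemTestRun("example_s3", True, 10.0),
        SystemTestRun("example_s3", False, 20.0),
        SystemTestRun("example_sqs", True, 5.0),
    ]
    print(to_json(example_runs))
```

A release-management script could then consume this payload directly instead of scraping a dashboard, assuming the endpoint returned something of this shape.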

Another interesting option would be for us to publish the CloudFormation 
templates (or the codebase used to generate the templates) for configuring the 
system test environment and executing the tests. This could be deployed to an 
AWS account owned and managed by the Airflow community where tests would be run 
periodically. AWS has provided some credits in the past which could be used to 
help fund the account. But this introduces a large component that would need 
ownership and management by folks within the Airflow community who have access 
to such AWS accounts and credits (likely only committers/release managers?). So 
it might not be worth the complexity.


I'd like to hear what folks think!

Cheers,
Niko





Re: [PROPOSAL] Provider's mixed governance model - first step of provider separation

2022-06-22 Thread Oliveira, Niko
+1 I agree this is a logical next step towards possibly separating provider 
code from the Airflow code base (and it's useful even if we never do that).

Cheers,
Niko


From: Kamil Breguła 
Sent: Monday, June 20, 2022 1:30 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][PROPOSAL] Provider's mixed governance model - first 
step of provider separation




I discussed this problem with Jarek. The group of stakeholders will be open and 
everyone can join, not just Google employees, etc. Release branches will be 
maintained in apache/airflow repository. Any non-committer change will still 
require PR. This means there is no vendor neutrality risk.

+1 I think this is a good step forward.

On Mon, 20 Jun 2022 at 21:18 Jarek Potiuk <ja...@potiuk.com> wrote:
BTW. It will also be possible for anyone in the community to cherry-pick 
changes from main and make a PR (which a committer will also have to approve 
and merge). This is really no different from what we have already done with 
cherry-picked commits to "v1-10-stable" and "v-2-3-stable" branches by 
non-committers. Random example here: https://github.com/apache/airflow/pull/14090

We do not give any privileges to the organisations. Quite the opposite - we 
make them responsible for preparing the PRs to be reviewed by committers.

J.

On Mon, Jun 20, 2022 at 9:07 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> I think we should continue to be strictly vendor-neutral. No organization 
> should be able to gain special privileges or control a project’s direction.

This is strictly vendor-neutral - Kamil - we are going to release the same 
changes that we are releasing already in main providers, just selectively 
cherry-picked (and then reviewed and merged by committer to the branch in 
airflow repo) - why do you think it is non-vendor neutral?

On Mon, Jun 20, 2022 at 9:04 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> We can keep these branches in forks managed by stakeholders teams, but I am 
> afraid of the benefit that it will be then copied by us to our repository and 
> then released by us. If the release was prepared by an external team, I think 
> we should make it clear that it was prepared by another team, including by 
> publishing on the Pypi account of the team that dealt with it.

> Yes this is exactly what I proposed.

Correction. I misread it.

We are going to merge - only the cherry-picked changes that have been reviewed 
and merged by the committer. Same way as today. Just the process of 
cherry-picks is going to be done by the stakeholders, selecting the things to 
cherry-pick (all the changes to cherry-pick should be already merged in main). 
What we are going to do is to release a subset of the changes we already approved 
(and released - because we are going to release those changes in the latest 
provider). So there will be no "new" changes in those forks - those will be 
just cherry-picked changes reviewed by the committer. There is no reason to 
mark it as "other" code - it will be the same changes that are going to be 
released anyway (just a subset of those).

J.

On Mon, Jun 20, 2022 at 9:00 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> Cherry-picking to branch v2-2* or 1.10.* can only be done by the committers, 
> because only they have write permission to the apache/airflow repository. As 
> far as I know, Github does not allow us to grant write-only permissions to 
> the selected branch.

Kamil - you misunderstood it. The branch will be in a FORK of those users' 
choice - not in the airflow repo. A committer will merge that branch in the same way 
we do today - but with fast-forwarding rather than squashing.

> We can keep these branches in forks managed by stakeholders teams, but I am 
> afraid of the benefit that it will be then copied by us to our repository and 
> then released by us. If the release was prepared by an external team, I think 
> we should make it clear that it was prepared by another team, including by 
> publishing on the Pypi account of the team that dealt with it.

Yes this is exactly what I proposed.

> I think that everything Apache PMC releases should be prepared and created 
> fully within the apache / airflow repositories. If stakeholders team do not 
> have such a possibility, we should figure out that these teams become part of 
> the community, and therefore work together with the entire community, not in 
> isolation. Only then will we be able to act in accordance with the Apache 
> Way, in particular each individual 
> person will be able to contribute to the community as an individual, and not 
> as a company or stakeholders team (Community of Peers) and no person will get 
> special privileges just on the basis of their employment status 

Re: Testing structure of your provider code

2022-05-09 Thread Oliveira, Niko
> I hope we can - as part of system tests improvements - add it for other 
> providers too :)


I plan to add the tests for AWS soon :)


From: Jarek Potiuk 
Sent: Sunday, May 8, 2022 12:34 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL]Testing structure of your provider code



I love it. It makes it more usable for other providers too. Merged it now after 
some reviews.

I hope we can - as part of system tests improvements - add it for other 
providers too :)

On Fri, Apr 29, 2022 at 9:26 AM Bartlomiej Hirsz <bartek.hi...@gmail.com> wrote:
Hi,
When I was working on AIP-47 (new design of system tests) I noticed that Google 
has tests for coverage of operators with examples.
I revived it a bit to also work with new system tests, but I wasn't fully 
satisfied. I have decided to rewrite it so it can be easily reused by other 
providers. As a bonus it now has a pluggable architecture. Here is the PR:
https://github.com/apache/airflow/pull/23351

Any provider that wishes to test for the coverage of their examples/system 
tests or test anything else in the structure of their operators can reuse this 
code (create their own classes in the file and inherit from base classes). In 
the case of Google I have created two kinds of tests: example coverage and also 
extra assets coverage. I also did it for elasticsearch and docker providers to 
show how it's possible to extend it for other providers.
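
The "inherit from base classes" idea described above might look roughly like the sketch below. It is an illustration only, not the code from the PR: the class names, the operator names and the substring-based matching are all hypothetical, and the real checks work on the repository's files rather than on in-memory strings.

```python
class ExampleCoverageCheck:
    """Base check: every operator should appear in at least one example.

    Hypothetical sketch; names and matching logic are assumptions.
    """

    PROVIDER = ""
    # Operators that are deliberately allowed to have no example.
    KNOWN_MISSING = frozenset()

    def missing_examples(self, operators, example_sources):
        """Return operator names not mentioned in any example source text."""
        used = {
            name
            for name in operators
            if any(name in source for source in example_sources)
        }
        return sorted(set(operators) - used - set(self.KNOWN_MISSING))


class FooExampleCoverageCheck(ExampleCoverageCheck):
    """A provider plugs in by subclassing and overriding the class attributes."""

    PROVIDER = "foo"  # made-up provider name
    KNOWN_MISSING = frozenset({"FooLegacyOperator"})


# Made-up operator names and a made-up example DAG snippet.
operators = ["FooCreateOperator", "FooDeleteOperator", "FooLegacyOperator"]
examples = ["create = FooCreateOperator(task_id='create')"]
missing = FooExampleCoverageCheck().missing_examples(operators, examples)
```

Here `missing` lists only the operators that have no example and are not explicitly excused, which is the kind of result a provider-specific test could assert to be empty.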

Check out the PR and let me know what you think and if there is room for 
improvements :)

Regards,
Bartlomiej Hirsz


Re: Code ownership over the provider's source code

2022-04-11 Thread Oliveira, Niko
> BTW. Seems that there is a problem that you STILL need write access to be 
> CODEOWNER (despite the updated documentation :) )

Sad, I was also very excited about this. It would make the workflow of keeping 
an eye on changes to code we "own" much easier. Fingers crossed it's a feature 
coming soon!


From: Jarek Potiuk 
Sent: Monday, April 11, 2022 3:39 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] Code ownership over the provider's source code



BTW. Seems that there is a problem that you STILL need write access to be 
CODEOWNER (despite the updated documentation :) ) 
https://github.com/apache/airflow/pull/22903

I hope it's just "future documentation" :)  and this feature is being released 
now (or maybe just the error message is outdated) 

I raised a support ticket for that to GitHub (you can't see it - but I put Elad 
and Bartłomiej on CC:) https://support.github.com/ticket/personal/0/1581140

Let's see!

J.

On Mon, Apr 11, 2022 at 12:29 PM Bartłomiej Hirsz <bartek.hi...@gmail.com> wrote:
Yes, you worded it perfectly - I don't intend to be the owner of the code (or 
to gatekeep it in any way) but merely to improve how we notify people about 
code changes (or assign code reviews). I listed my PR as an example but it was 
a general question - I can see that there could be people interested in 
knowing that particular code is about to be changed by a PR, which would be 
easy to miss in big projects without automation such as a codeowner file. In 
that particular case I would just point to the new AIP and help to migrate the 
PR (and I wouldn't be able to veto it of course, just provide a review + help 
with comments). I love that part of OSS where others contribute to the code 
and we're all owners of it in a broad sense so I don't want to lose it :)

On Mon, Apr 11, 2022 at 12:20 Jarek Potiuk <ja...@potiuk.com> wrote:
I think you are both right and both wrong - Elad and Bartłomiej :)

Elad is quite right when it comes to "gating" changes. You might get notified 
about a change and raise your concerns but the nature of an ASF project is 
simple - committers decide whether a change is ready. Any committer (and 
committers only) can veto a code change if it is justified 
(https://www.apache.org/foundation/voting.html#votes-on-code-modification).
A non-committer cannot veto a change. So you cannot (and will not get to) 
decide what can be merged or not if you are not a committer. This is what 
committer status gives - and it gives it for all code in the whole Apache 
project (by definition it cannot be only for parts of the code). There 
are no (and will never be) partial committers or "less access committers" for 
an Apache project. I am quite certain of it (as I discussed this very thing at 
the OSS Backstage in Berlin with a few ASF old-timers). So committer is 
"all-or-nothing".

However there is absolutely nothing wrong (and this is even encouraged) that 
non-committers review and provide opinions and helpful comments and raise 
concerns. But - and I believe this is what Bartek asked for -  they have to be 
notified somehow about the change for relevant parts of the code they are 
interested in. I can imagine tracking "all changes" is impossible so having a 
nice tool to do it automatically is great.

But also Bartłomiej - you are wrong that you have to wait for the code owner. 
Even being in CODEOWNER does not imply it. We don't do it even now when we have 
committers being CODEOWNERS - we do not wait for those CODEOWNERS who are 
defined for parts of the code when we merge stuff - not even for the core. 
Sometimes we reach out when we need someone's opinion (if we know that this 
person is more knowledgeable in this area), but it is rarely a "blocking" call.
The ONLY real gate for merging the code is any committer's approval. We have a 
rule that two committers need to agree on "core" change additionally (but this 
is more of a gentleman's agreement, rather than enforced rule so far). And I 
think we cannot and should not change it. It would be nice to wait for an 
opinion from someone who is CODEOWNER, but this is just "nice to have". By the 
rules of ASF if you are not a committer, you cannot decide for the code to be 
merged nor veto it. You can give advice and have opinions and concerns but you 
cannot decide.

And CODEOWNERS as of recently seems to allow that and make non-committers 
CODEOWNERS (which I really like actually). I think the name is a little 
misleading. CODEOWNER does not give you more rights when it comes to making 
code ready to be merged, or having a veto. It's merely "you are by default assigned 
to be a reviewer of that code". It does not mean (contrary to the name) that 
you are OWNER of that code. It means that you are a stakeholder who is 

Re: New Commiter: Malthe Borch

2022-02-22 Thread Oliveira, Niko
Congrats Malthe, well deserved!


From: Vikram Koka 
Sent: Monday, February 21, 2022 2:08 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] New Commiter: Malthe Borch



Congratulations and welcome Malthe!

On Sat, Feb 19, 2022 at 9:22 AM Daniel Standish wrote:
Congrats Malthe!


Re: New committer: Josh Fell

2022-02-22 Thread Oliveira, Niko
Congrats Josh, well deserved!


From: Vikram Koka 
Sent: Monday, February 21, 2022 2:07 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] New committer: Josh Fell



Congratulations and Welcome Josh!

On Sat, Feb 19, 2022 at 9:21 AM Daniel Standish wrote:
Congrats!

On Sat, Feb 19, 2022 at 5:11 AM Kaxil Naik <kaxiln...@gmail.com> wrote:
Congratulations Josh, welcome aboard

On Sat, 19 Feb 2022 at 07:38, Ephraim Anierobi wrote:
Congratulations Josh!

On Sat, 19 Feb 2022 at 08:37, Tomasz Urbaszek <turbas...@apache.org> wrote:
Congrats Josh!


Re: [DISCUSSION] AIP-47 New design of Airflow System Tests

2022-02-09 Thread Oliveira, Niko
Hey folks,


I think both these sticking points are really a trade-off of simplicity vs 
consistency/reliability. And to be clear I'm not arguing for things to be more 
complex just for the heck of it, I agree that simplicity is great! But there 
needs to be a balance and we can't get caught over-indexing on one or the 
other. I think test environments being a free-for-all, combined with tests 
being simply a set of guidelines with some static analysis, will be brittle. 
The example Mateusz just described, of needing a watcher task to ensure tests 
end with the right result, is a great example of how kludging the example dags 
themselves into being both the test and the test runner can be brittle and 
complicated. And again, I love the idea of the example dags being the code 
under test, I just think having them also conduct their own test execution is 
going to be troublesome.


But as always, if I'm the only one worried about this, I'm happy to disagree 
and commit and see how it goes :)


Cheers,
Niko


From: Jarek Potiuk 
Sent: Sunday, February 6, 2022 8:52 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [DISCUSSION] AIP-47 New design of Airflow System Tests



I think Mateusz explained the points very well. I have just a few comments to 
some of the points.

> 3. In general the AIP reads as if it's solved this problem, but it's more 
> like it has absolved itself from solving this problem, which is much 
> different. I think this approach could possibly make things even worse as now 
> there is no contract or interface for how to plumb configuration and 
> credentials to the system test dags. The current set of methods and files to 
> plumb credentials through aren't great (and as of now are quite Google 
> specific) but I think this interface can be simplified and improved rather 
> than just exported wholesale for each provider to re-invent a new approach.

We've discussed it extensively with Mateusz (I was also of the opinion that we 
could do some automation here). For example we could write a "terraform" 
script that creates the whole environment - set up all the service accounts 
etc. But Mateusz convinced me it will be very hard to "mandate" a common way of 
doing it for multiple "services" or "groups" of services. My proposal is that 
we should be clear in the AIP/framework that we don't solve it in a "common 
way". But instead we keep a "service-specific" way of preparing the 
environment. We might automate it - in a service-specific way, but having it as 
part of the system tests is I think out-of-scope. In a way we currently have it 
already with our "regular" tests. To build the AMI to run our self-hosted 
runners, we have a separate repo: https://github.com/apache/airflow-ci-infra/ - 
where we have "packer" scripts which build our image. We even tried Terraform 
there, but well - packer is "good enough". And we can have a separate 
"airflow-system-tests-aws" repo and "airflow-system-tests-gcp" repo, where we 
will separately document and possibly automate how to build such a "runner".

> 4.  A system that relies on good intentions like "be sure to remember to do X 
> otherwise bad thing Y will happen" certainly guarantees that bad thing Y will 
> happen, frequently. Humans are fallible and not great at sticking to a 
> contract or interface that isn't codified. And this AIP is littered with 
> statements like this. We need test infrastructure that's easier to use and 
> will also enforce these best practices/requirements which are needed for the 
> tests to run.

Here - I wholeheartedly agree with Mateusz. This is a GREAT simplification to 
have one example file doing everything. The previous approach we had was 
extremely complex - you had scripts, pytest tests, example dags and they were 
providing (meta)data to each other and it was not only hard to reason about 
them but also hard to verify that they are "ok". The idea of just making it one 
file is great. And the amount of "be sure" is not only small but it actually 
can be very easily enforced by pre-commits. We could make sure that our 
"example_dags" in a certain provider contain the (few) lines that need to be 
copied among them - having a common "header" and "footer" on an example dag is 
a super-simple pre-commit to write. We also discussed some other approaches and I 
think it is really powerful that the scripts can be run as "pytest" tests and 
as "standalone" scripts with SequentialDebugger. The level of control it gives 
for manual runs and the level of automation it provides by tapping into some of 
the great pytest features (parallel runs, status, flakiness, timeouts, and 
plethora of other plugins) - all of that makes it great to run multiple tests 
in the CI 

Re: [DISCUSSION] AIP-47 New design of Airflow System Tests

2022-01-25 Thread Oliveira, Niko
Hey folks,


Very excited about this AIP!

I work on the AWS OSS Airflow team. Getting the AWS system tests running has 
been a pet-project of mine for the past little while. I come from a test 
automation background, so this is dear to my heart :)


Currently I have a branch that contains the implementation of the various 
methods (mostly around environment and credential configuration) for AWS system 
tests, but I was running into very obscure runtime issues. So I'm glad to see 
one of the goals of this AIP is to simplify this process, since it is 
convoluted and clunky.


Here are some high level thoughts after my read through the AIP:


1. “CI integration can be built using GitHub CI or provider-related solution“ - 
I'm happy (and excited) to collaborate on this front! I already have a proof of 
concept running (internally to AWS) built with AWS Code Pipeline and Code 
Build, which runs various tests each time my personal fork is updated from 
upstream. But we could easily make something like this public and then connect 
it to the Airflow repo to verify commits, PRs, etc.


2. After reading through the AIP, I still don't think I truly grok the 
SYSTEM_TESTS_ENV_ID environment variable. I understand its use, particularly 
for uniqueness, but who is the owner for setting this? What precisely does it 
represent? Including an example of an actual value for that env variable in the 
AIP would be helpful!

3. The AIP reads: “Maintaining system tests requires knowledge about breeze, 
pytest and maintenance of special files like variables.env or credential files. 
Without those, we will simplify the architecture and improve management over 
tests.”
The AIP talks a lot about removing this type of setup from the Airflow 
system test platform to simplify things and that "All needed permissions to 
external services for execution of DAGs (tests) should be provided to the 
Airflow instance in advance." and that "Each provider should create an 
instruction explaining how to prepare the environment to run related system 
tests so that users can do it on their own".
   In general the AIP reads as if it's solved this problem, but it's more like 
it has absolved itself from solving this problem, which is much different. I 
think this approach could possibly make things even worse as now there is no 
contract or interface for how to plumb configuration and credentials to the 
system test dags. The current set of methods and files to plumb credentials 
through aren't great (and as of now are quite Google specific) but I think this 
interface can be simplified and improved rather than just exported wholesale 
for each provider to re-invent a new approach.


4. Lastly, regarding the actual construction of the tests themselves, the 
proposed design is full of statements like:
   - "If a teardown task(s) has been defined, remember to add 
trigger_rule="all_done" parameter to the operator call. This will make sure 
that this task will always run even if the upstream fails"
   - “Make sure to include these parameters into DAG call: ...“

   - “Change Airflow Executor to DebugExecutor by placing this line at the top 
of the file: ...”

   - “Try to keep tasks in the DAG body defined in an order of execution.”


   A system that relies on good intentions like "be sure to remember to do X 
otherwise bad thing Y will happen" certainly guarantees that bad thing Y will 
happen, frequently. Humans are fallible and not great at sticking to a contract 
or interface that isn't codified. And this AIP is littered with statements like 
this. We need test infrastructure that's easier to use and will also enforce 
these best practices/requirements which are needed for the tests to run.
In general, it reads much more like a guideline on best practices rather than a 
new and improved system test engine.
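
As one illustration of what "codified" could mean for the trigger_rule="all_done" reminder, here is a minimal sketch of a static check that could run in CI or as a pre-commit hook. It is not part of the AIP. The function name is hypothetical, and so is the convention that teardown tasks have a task_id starting with "teardown".

```python
import ast


def teardown_tasks_missing_all_done(source, prefix="teardown"):
    """Return task_ids of teardown-like tasks lacking trigger_rule="all_done".

    Assumption: a teardown task is any call whose task_id string literal
    starts with `prefix`. That naming convention is hypothetical.
    """
    offenders = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Map keyword names to their AST values; skip **kwargs (arg is None).
        kwargs = {kw.arg: kw.value for kw in node.keywords if kw.arg}
        task_id = kwargs.get("task_id")
        if not (isinstance(task_id, ast.Constant) and isinstance(task_id.value, str)):
            continue
        if not task_id.value.startswith(prefix):
            continue
        rule = kwargs.get("trigger_rule")
        # Accept both the string form and an enum-style `Something.ALL_DONE`.
        is_all_done = (
            isinstance(rule, ast.Constant) and rule.value == "all_done"
        ) or (isinstance(rule, ast.Attribute) and rule.attr == "ALL_DONE")
        if not is_all_done:
            offenders.append(task_id.value)
    return sorted(offenders)


# The DAG file is only parsed, never imported, so no operators need to exist.
example_dag = '''
create = BashOperator(task_id="create_bucket", bash_command="echo create")
good = BashOperator(
    task_id="teardown_bucket", bash_command="echo rm", trigger_rule="all_done"
)
bad = BashOperator(task_id="teardown_table", bash_command="echo drop")
'''

offenders = teardown_tasks_missing_all_done(example_dag)
```

Because the check works on the parsed source, it can fail a build when a teardown task forgets the trigger rule instead of relying on the author remembering it.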


Thanks for taking the time to create this AIP. I'm very eager to get system 
testing up and running in Airflow and I'd love to collaborate further on it!

Cheers,
Niko



From: Jarek Potiuk 
Sent: Tuesday, January 25, 2022 8:57 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [DISCUSSION] AIP-47 New design of Airflow System Tests



Just let me add a bit more context as I see it.

The AIP-47 design (following the AIP-4 idea) is targeted to handle those things:

* streamline the execution of the System Tests (which are basically executable 
"example_dags" of ours which we already have in our documentation for providers)
* make them integrated and automatically execute in a "cloud" environment - 
with real cloud services behind
* make use of the 100s of System Tests already written in Google Provider (and 
many others which we have in other providers - for example AWS should be 
close to follow)
* test automation of those using 

Re: [DISCUSS] Shaping the future of executors for Airflow (slowly phasing out Celery ?)

2021-11-25 Thread Oliveira, Niko
> We could even likely think about
> adding more options of similar kind for GCP/AWS/Azure - using native
> capabilities of those platforms rather than using generic "Kubernetes"
> as remote execution. I can imagine using Fargate (AWS team could
> contribute it ), Cloud Run (Google team), Azure Container Instances
> (maybe Microsoft will finally also embrace Airflow :) ) .  That would
> make the Airflow architecture more "Multiple Cloud Native".

From the AWS side we're very interested and happy to work on something like a 
Fargate executor; it's on our roadmap either way.

But I think a generalized "cloud" or "serverless" executor would make a lot of 
sense. From AWS alone you may want to execute "small" tasks within a Lambda 
(quick start up time but small amount of compute and a 15min max run time) and 
then "medium" to "large" tasks in ECS Fargate or Batch (with longer startup 
times but more compute available), etc. And the same goes for other cloud 
provider equivalents. A harmonized and configurable solution could make 
directing tasks to different execution environments very smooth.
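
To make the routing idea a bit more concrete, here is a rough sketch of how a task could be directed to a backend by its expected size. Everything in it is hypothetical (the names, the memory threshold, the two-backend split); only the 15 minute Lambda limit comes from the paragraph above. It is not an existing Airflow executor API.

```python
from dataclasses import dataclass

# The 15 minute maximum Lambda run time mentioned above, in seconds.
LAMBDA_MAX_SECONDS = 15 * 60


@dataclass(frozen=True)
class TaskEstimate:
    """Rough size of a task; how these numbers are obtained is left open."""

    expected_seconds: int
    memory_mb: int


def choose_backend(task, lambda_memory_limit_mb=1024):
    """Pick a backend name for a task.

    `lambda_memory_limit_mb` is an illustrative, configurable threshold,
    not a statement about actual service limits.
    """
    fits_lambda = (
        task.expected_seconds < LAMBDA_MAX_SECONDS
        and task.memory_mb <= lambda_memory_limit_mb
    )
    return "lambda" if fits_lambda else "fargate"


small = choose_backend(TaskEstimate(expected_seconds=30, memory_mb=256))
long_running = choose_backend(TaskEstimate(expected_seconds=3600, memory_mb=256))
```

A configurable rule like this is the sort of thing a harmonized executor could expose, with each cloud provider supplying its own backends and thresholds.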
 

From: Jarek Potiuk 
Sent: Thursday, November 25, 2021 2:40 AM
To: dev@airflow.apache.org
Subject: [EXTERNAL] [DISCUSS] Shaping the future of executors for Airflow 
(slowly phasing out Celery ?)



Hello Everyone,

I recently had some discussions and thought about some new features
already implemented, as well as planned and in-progress work, and I had a
thought that is maybe worth discussing here.

It's very likely many of the people involved had similar discussions
and thoughts, but maybe it's worth spelling it out now and having a
common "direction" we are heading in for the future of airflow when it
comes to executors.

TL;DR; I think the recent changes and possibly some future
improvements and optimisation can lead us to the situation that we
will not need Celery Executor (nor CeleryKubernetes)  and can phase it
out eventually - leaving only Local, Kubernetes and soon coming
LocalKubernetes one. We might still "support" CeleryExecutor for
backwards compatibility and people who do not want to run Kubernetes,
but in a way the main reasons why Celery would be preferred over
Kubernetes should be gone soon IMHO.

Why do I think so ?

I think so because I believe the main problems of having
CeleryExecutor in the first place are largely gone. The main reason
why Celery executor was better than the Kubernetes one was that you
could run more short tasks with far less overhead and latency. However
we now have ways, either already implemented or easy to optimise, of
significantly decreasing the need to run small tasks via "remote"
executors.

The following things already happened:

1) We have Deferrable Operators support. Most of the code for
small tasks, or for the parts of operators that wait for something,
is already executed in the triggerer.

2) We have a HA scheduler where you could run multiple schedulers with
Local Executor - thus you can get scalability in LocalExecutor for
small tasks.

3) We had some optimisations in DummyOperator where triggering is done
in Scheduler.

What can still be done (or is already being done):

* While triggerer does not (I believe) support multiple instances for
now, it has been designed from ground up to support HA/scalability.

* We can rewrite a lot of the operators we have to be Deferrable -
especially those that reach out to external services.

* We can make more "built-in" operators that have some declarative
behaviour rather than imperative "execute" and have them evaluated
directly in Scheduler. We had a discussion about it in
https://github.com/apache/airflow/pull/19361 - but looks like it
should be possible to implement - for example - "DayOfWeek" operator
that would be evaluated in Scheduler and triggering decisions could be
made there. We could probably add quite a number of such "optimized"
operators that could be declarative and evaluated in a scheduler with
virtually 0 overhead.

* with LocalKubernetes executor coming
https://github.com/apache/airflow/pull/19729 combined with
HA/scalability of scheduler (thus scalability of Local Executors) - It
seems that any reasonable installation will have enough scalability
and capacity to locally execute all the remaining "small tasks" in
Local Executors. We could even try to figure out some good pattern of
figuring out which tasks are "small" and automatically using
LocalExecutor for them - eventually.
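
As a sketch of the "declarative rather than imperative" idea from the list above, a "DayOfWeek"-style rule could be plain data plus a cheap predicate that a scheduler evaluates itself, with no task handed to an executor. The class and method names below are hypothetical and are not taken from the linked PR.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DayOfWeekRule:
    """Declarative rule: data only, no user-supplied execute() method."""

    # 0 = Monday ... 6 = Sunday, matching date.weekday().
    allowed_weekdays: frozenset

    def should_trigger(self, logical_date):
        """Cheap, side-effect-free check a scheduler loop could call inline."""
        return logical_date.weekday() in self.allowed_weekdays


weekdays_only = DayOfWeekRule(allowed_weekdays=frozenset(range(5)))
thursday = date(2021, 11, 25)
saturday = date(2021, 11, 27)
```

Since the rule has no arbitrary user code, evaluating `weekdays_only.should_trigger(thursday)` in the scheduler costs almost nothing, which is the point of the optimisation.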

It seems to me that with those upcoming changes, LocalKubernetes
should be default executor in the future rather than Celery (which is
now kind-of de facto "default"). We could even likely think about
adding more options of similar kind for GCP/AWS/Azure - using native
capabilities of those platforms rather than using generic "Kubernetes"
as 

Re: OOM issue in the CI

2021-11-09 Thread Oliveira, Niko
Hey all,

Just to throw another data point in the ring, I've had a PR stuck in the same 
way as well. Several retries are all failing with the same OOM.


I've also dug through the Github Actions history and found a few others. So it 
doesn't seem to be just a one-off.


Cheers,
Niko


From: Khalid Mammadov 
Sent: Tuesday, November 9, 2021 6:24 AM
To: dev@airflow.apache.org
Subject: [EXTERNAL] OOM issue in the CI




Hi Devs,

I have been working on the PR below and ran into an OOM issue during testing on 
GitHub Actions (you can see it in the commit history).

https://github.com/apache/airflow/pull/19139/files

The tests for databases (Postgres, MySQL etc.) fail due to OOM and docker gets 
killed.

I have reduced parallelism to 1 "in the code" temporarily (the only extra 
change in the PR) and it passes all the checks which confirms the issue.


I was hoping you could advise on the best course of action in this situation: 
can I force parallelism to 1 to get all checks passed, or is there some other 
way to solve the OOM?

Any help would be appreciated.


Thanks in advance

Khalid