Re: [DISCUSS] Number of queries to Airflow database in "DAG File Processing Stats"

2024-06-14 Thread Daniel Standish
I wonder if we could generalize this to summarize what network activity
there was during DAG parsing more generally, e.g. whether there were API
calls or whatnot.

A quick Google search yields this:
https://thepythoncode.com/article/make-a-network-usage-monitor-in-python
I am not sure it could be used for this, but intuitively it seems possible.
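
One way this could work, as a rough sketch only: count outgoing connection attempts made while a file is being parsed by temporarily wrapping `socket.socket.connect`. The helper names (`count_outgoing_connections`, `fake_dag_parse`) are made up for illustration and are not part of Airflow. The wrapper only sees connections opened through Python's `socket` module, so traffic from C-level client libraries would be missed.

```python
import socket
from contextlib import contextmanager


@contextmanager
def count_outgoing_connections():
    """Count socket.connect() calls made inside the with-block."""
    stats = {"connects": 0}
    original_connect = socket.socket.connect

    def counting_connect(self, address):
        stats["connects"] += 1
        return original_connect(self, address)

    socket.socket.connect = counting_connect
    try:
        yield stats
    finally:
        # Always restore the original method, even if parsing raised.
        socket.socket.connect = original_connect


def fake_dag_parse(port):
    """Hypothetical stand-in for a DAG file that calls an API at import time."""
    with socket.create_connection(("127.0.0.1", port), timeout=5):
        pass


# Local listener so the example needs no external network access.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

with count_outgoing_connections() as stats:
    fake_dag_parse(port)

server.close()
connects_during_parse = stats["connects"]
print(f"outgoing connections during parse: {connects_during_parse}")
```

Counting connection attempts is coarser than measuring bytes, but it is enough to flag a DAG file that talks to the network at parse time.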

On Fri, Jun 14, 2024 at 1:45 PM Jarek Potiuk  wrote:

> Agree with Kaxil. Last run is good. Actually, if we are looking at Airflow
> 3, the target for 'ready for migration' will be just '0 everywhere'.
> Having the last run, getting totals, and seeing them go to 0 steadily should
> be the 'ready for migration' target for anyone running Airflow 3.
>
> On Fri, 14 Jun 2024 at 22:36, Kaxil Naik wrote:
>
> > Yeah, it is 30s by default; storing it in DB would be useless, apart from
> > the most recent run, in my opinion.
> >
> > On Fri, 14 Jun 2024 at 15:24, Constance Martineau
> >  wrote:
> >
> > > I love the idea. If we were to store it in the DB, would we keep a
> > history,
> > > or only the latest stats from the most recent dag parsing loop? DAG
> > parsing
> > > by default is every 30s right?
> > >
> > > On Fri, Jun 14, 2024 at 6:53 AM Jarek Potiuk  wrote:
> > >
> > > > > I think we still need to enable the ability for DAGs at parse time
> to
> > > > access Variables.
> > > >
> > > > It's actually DISCOURAGED ;) to access Variables at parse time -
> though
> > > we
> > > > have an experimental feature to make it efficient and we discussed
> > > whether
> > > > to treat it as "good practice" and there was strong opposition to
> that
> > > 
> > > > So I am not actually sure what will be our position on it.
> > > >
> > > > On Fri, Jun 14, 2024 at 12:50 PM Ash Berlin-Taylor 
> > > wrote:
> > > >
> > > > >
> > > > >
> > > > > > On 14 Jun 2024, at 10:22, Jarek Potiuk  wrote:
> > > > > >
> > > > > > > I think in the future of Airflow 3 where
> > > > > > we will have task isolation, having `0` for all the DAGs will be
> a
> > > > > > prerequisite for switching to "task isolation" mode and this
> could
> > be
> > > > > > actually verified in a migration tool.
> > > > >
> > > > > I think we still need to enable the ability for DAGs at parse time
> to
> > > > > access Variables.
> > > > >
> > > > > Or at least I am not proposing we remove that ability. (I wouldn’t
> be
> > > > > against it, but I was planning on continuing to support that for
> now)
> > > > >
> > > > > -ash
> > > >
> > >
> >
>


[VOTE] Release Apache Airflow Helm Chart 1.14.0 based on 1.14.0rc1

2024-06-14 Thread Jed Cunningham
Hello Apache Airflow Community,

This is a call for the vote to release Helm Chart version 1.14.0.

The release candidate is available at:
https://dist.apache.org/repos/dist/dev/airflow/helm-chart/1.14.0rc1/

airflow-chart-1.14.0-source.tar.gz - is the "main source release" that
comes with INSTALL instructions.
airflow-1.14.0.tgz - is the binary Helm Chart release.

Public keys are available at: https://www.apache.org/dist/airflow/KEYS

For convenience "index.yaml" has been uploaded (though excluded from
voting), so you can also run the below commands.

helm repo add apache-airflow-dev
https://dist.apache.org/repos/dist/dev/airflow/helm-chart/1.14.0rc1/
helm repo update
helm install airflow apache-airflow-dev/airflow

airflow-1.14.0.tgz.prov - is also uploaded for verifying Chart Integrity,
though not strictly required for releasing the artifact based on ASF
Guidelines.

$ helm gpg verify airflow-1.14.0.tgz
gpg: Signature made Fri Jun 14 14:41:31 2024 MDT
gpg:using RSA key E1A1E984F55B8F280BD9CBA20BB7163892A2E48E
gpg:issuer "jedcunning...@apache.org"
gpg: Good signature from "Jed Cunningham "
[ultimate]
plugin: Chart SHA verified.
sha256:206d7eae00697bfd2fbe896b58adf9d66f928f70ee75a07e5772be86c9ed6185

The vote will be open for at least 72 hours (2024-06-17 20:59 UTC) or until
the necessary number of votes is reached.

https://www.timeanddate.com/countdown/to?iso=20240617T2059&p0=136&font=cursive

Please vote accordingly:

[ ] +1 approve
[ ] +0 no opinion
[ ] -1 disapprove with the reason

Only votes from PMC members are binding, but members of the community are
encouraged to test the release and vote with "(non-binding)".

Consider this my (binding) +1.

For license checks, the .rat-excludes files is included, so you can run the
following to verify licenses (just update your path to rat):

tar -xvf airflow-chart-1.14.0-source.tar.gz
cd airflow-chart-1.14.0
java -jar apache-rat-0.13.jar chart -E .rat-excludes

Please note that the version number excludes the `rcX` string, so it's now
simply 1.14.0. This will allow us to rename the artifact without modifying
the artifact checksums when we actually release it.

The status of testing the Helm Chart by the community is kept here:
https://github.com/apache/airflow/issues/40248

Thanks,
Jed


Re: [DISCUSS] Number of queries to Airflow database in "DAG File Processing Stats"

2024-06-14 Thread Jarek Potiuk
Agree with Kaxil. Last run is good. Actually, if we are looking at Airflow
3, the target for 'ready for migration' will be just '0 everywhere'.
Having the last run, getting totals, and seeing them go to 0 steadily should
be the 'ready for migration' target for anyone running Airflow 3.
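
As a minimal sketch of that '0 everywhere' check, assuming the last-run query counts are available as a mapping from DAG file path to count (the function name and the sample paths are hypothetical, not an existing Airflow API):

```python
def files_blocking_migration(last_run_queries):
    """Return DAG files whose most recent parse still queried the database."""
    return sorted(path for path, count in last_run_queries.items() if count > 0)


# Hypothetical last-run counts, keyed by DAG file.
last_run = {
    "dags/etl.py": 0,
    "dags/reporting.py": 4,
    "dags/cleanup.py": 0,
}

blocking = files_blocking_migration(last_run)
ready_for_migration = not blocking

print("ready for migration:", ready_for_migration)
for path in blocking:
    print("still querying the DB at parse time:", path)
```

A migration tool could run a check like this and list the offending files rather than just reporting a yes/no answer.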

On Fri, 14 Jun 2024 at 22:36, Kaxil Naik wrote:

> Yeah, it is 30s by default; storing it in DB would be useless, apart from
> the most recent run, in my opinion.
>
> On Fri, 14 Jun 2024 at 15:24, Constance Martineau
>  wrote:
>
> > I love the idea. If we were to store it in the DB, would we keep a
> history,
> > or only the latest stats from the most recent dag parsing loop? DAG
> parsing
> > by default is every 30s right?
> >
> > On Fri, Jun 14, 2024 at 6:53 AM Jarek Potiuk  wrote:
> >
> > > > I think we still need to enable the ability for DAGs at parse time to
> > > access Variables.
> > >
> > > It's actually DISCOURAGED ;) to access Variables at parse time - though
> > we
> > > have an experimental feature to make it efficient and we discussed
> > whether
> > > to treat it as "good practice" and there was strong opposition to that
> > 
> > > So I am not actually sure what will be our position on it.
> > >
> > > On Fri, Jun 14, 2024 at 12:50 PM Ash Berlin-Taylor 
> > wrote:
> > >
> > > >
> > > >
> > > > > On 14 Jun 2024, at 10:22, Jarek Potiuk  wrote:
> > > > >
> > > > > > I think in the future of Airflow 3 where
> > > > > we will have task isolation, having `0` for all the DAGs will be a
> > > > > prerequisite for switching to "task isolation" mode and this could
> be
> > > > > actually verified in a migration tool.
> > > >
> > > > I think we still need to enable the ability for DAGs at parse time to
> > > > access Variables.
> > > >
> > > > Or at least I am not proposing we remove that ability. (I wouldn’t be
> > > > against it, but I was planning on continuing to support that for now)
> > > >
> > > > -ash
> > >
> >
>


Re: [DISCUSS] Number of queries to Airflow database in "DAG File Processing Stats"

2024-06-14 Thread Kaxil Naik
Yeah, it is 30s by default; storing it in DB would be useless, apart from
the most recent run, in my opinion.

On Fri, 14 Jun 2024 at 15:24, Constance Martineau
 wrote:

> I love the idea. If we were to store it in the DB, would we keep a history,
> or only the latest stats from the most recent dag parsing loop? DAG parsing
> by default is every 30s right?
>
> On Fri, Jun 14, 2024 at 6:53 AM Jarek Potiuk  wrote:
>
> > > I think we still need to enable the ability for DAGs at parse time to
> > access Variables.
> >
> > It's actually DISCOURAGED ;) to access Variables at parse time - though
> we
> > have an experimental feature to make it efficient and we discussed
> whether
> > to treat it as "good practice" and there was strong opposition to that
> 
> > So I am not actually sure what will be our position on it.
> >
> > On Fri, Jun 14, 2024 at 12:50 PM Ash Berlin-Taylor 
> wrote:
> >
> > >
> > >
> > > > On 14 Jun 2024, at 10:22, Jarek Potiuk  wrote:
> > > >
> > > > > I think in the future of Airflow 3 where
> > > > we will have task isolation, having `0` for all the DAGs will be a
> > > > prerequisite for switching to "task isolation" mode and this could be
> > > > actually verified in a migration tool.
> > >
> > > I think we still need to enable the ability for DAGs at parse time to
> > > access Variables.
> > >
> > > Or at least I am not proposing we remove that ability. (I wouldn’t be
> > > against it, but I was planning on continuing to support that for now)
> > >
> > > -ash
> >
>


Re: [DISCUSS] Proposal around the Injection of Task Execution Secrets

2024-06-14 Thread Jarek Potiuk
First pass done, especially around the security aspects of it. Looks great.

On Fri, Jun 14, 2024 at 2:55 PM Ash Berlin-Taylor  wrote:

> I’ve written up a lot more of the implementation details into an AIP
> https://cwiki.apache.org/confluence/x/xgmTEg
>
> It’s still marked as Draft/Work In Progress for now as there are few
> details we know we need to cover before the doc is complete.
>
> (There was also some discussion in the dev call about a different name for
> this AIP)
>
> > On 7 Jun 2024, at 19:25, Ash Berlin-Taylor  wrote:
> >
> >> IMHO - if we do not want to support DB access at all from workers,
> > triggerers and DAG file processors, we should replace the current "DB"
> > bound interface with a new one specifically designed for this
> > bi-directional direct communication Executor <-> Workers,
> >
> > That is exactly what I was thinking too (both that no DB should be the
> only option in v3, and that we need a bidirectional purpose designed
> interface) and am working up the details.
> >
> > One of the key features of this will be giving each task try a "strong
> identity" that the API server can use to identify and trust the requests,
> likely some form of signed JWT.
> >
> > I just need to finish off some other work before I can move over to
> focus on Airflow fully
> >
> > -a
> >
> > On 7 June 2024 18:01:56 BST, Jarek Potiuk  wrote:
> >> I added some comments here and I think there is one big thing that should
> >> be clarified when we get to "task isolation" - mainly its dependence on
> >> AIP-44.
> >>
> >> The Internal gRPC API (AIP-44) was only designed in the way it was
> designed
> >> to allow using the same codebase to be used with/without DB. It's based
> on
> >> the assumption that a limited set of changes will be needed (that was
> >> underestimated) in order to support both DB and GRPC ways of
> communication
> >> between workers/triggerers/DAG file processors at the same time. That
> was a
> >> basic assumption for AIP-44 - that we will want to keep both ways and
> >> maximum backwards compatibility (including "pull" model of worker
> getting
> >> connections, variables, and updating task state in the Airflow DB). We
> are
> >> still using "DB" as a way to communicate between those components and
> this
> >> does not change with AIP-44.
> >>
> >> But for Airflow 3 the whole context is changed. If we go with the
> >> assumption that Airflow 3 will only have isolated tasks and no DB
> "option",
> >> I personally think using AIP-44 for that is a mistake. AIP-44 is merely
> a
> >> wrapper over existing DB calls designed to be kept updated together with
> >> the DB code, and the whole synchronisation of state, heartbeats,
> variables
> >> and connection access still uses the same "DB communication" model and
> >> there is basically no way we can get it more scalable this way. We will
> >> still have the same limitations on the DB - where a number of DB
> >> connections will be replaced with a number of gRPC connections.
> >> Essentially, more scalability and performance were never the goal of
> >> AIP-44; all
> >> the assumptions are that it only brings isolation but nothing more will
> >> change. So I think it does not address some of the fundamental problems
> >> stated in this "isolation" document.
> >>
> >> Essentially AIP-44 merely exposes a small-ish number of methods (bigger
> >> than initially anticipated) but it only wraps around the existing DB
> >> mechanism. Essentially from the performance and scalability point of
> view -
> >> we do not get much more than currently when using pgbouncer. This one
> >> essentially turns a big number of connections coming from workers into a
> >> smaller number of pooled connections that pgbouncer manages internally and
> >> multiplexes the calls over. With the difference that unlike AIP-44
> Internal
> >> API server, pgbouncer does not limit the operations you can do from the
> >> worker/triggerer/dag file processor - that's the main difference between
> >> using pgbouncer and using our own Internal-API server.
> >>
> >> IMHO - if we do not want to support DB access at all from workers,
> >> triggerers and DAG file processors, we should replace the current "DB"
> >> bound interface with a new one specifically designed for this
> >> bi-directional direct communication Executor <-> Workers, more in line
> with
> >> what Jens described in AIP-69 (and for example WebSocket and
> asynchronous
> >> communication comes immediately to my mind if I did not have to use DB
> for
> >> that communication). This is also why I put the AIP-67 on hold because
> IF
> >> we go that direction that we have "new" interface between worker,
> triggerer
> >> , dag file processor - it might be way easier (and safer) to introduce
> >> multi-team in Airflow 3 rather than 2 (or we can implement it
> differently
> >> in Airflow 2 and differently in Airflow 3).
> >>
> >>
> >>
> >> On Tue, Jun 4, 2024 at 3:58 PM Vikram Koka
> >> wrote:
> >>
> >>> Fellow Airflowers

Re: Proposal for Enhanced Data Awareness in Airflow

2024-06-14 Thread Jarek Potiuk
First pass done. I think it's a great direction for Dataset -> Asset, but I
think it would help to clarify with some examples what Threshold and
Tolerance would look like. I am also a bit unclear on how partitioning works,
and I think some examples would be useful there too.

On Thu, Jun 13, 2024 at 2:49 PM Constance Martineau
 wrote:

> Hi Airflow Dev Community!
>
> I am excited to share a new proposal written by TP and me, titled "Enhanced
> Data Awareness in Airflow
> <
> https://docs.google.com/document/d/1Sra65yjbAIZ2mZIbSUL9YMPrW73ltDEPWTCD4J3j2hQ/edit#heading=h.f9eh19p4yqfw
> >"
> that I believe will significantly advance our capabilities in data
> orchestration.
>
> The proposal aims to bridge the gap between task management and data
> management within Airflow by integrating enhanced data awareness features.
> This evolution unlocks Airflow's ability to make informed orchestration
> decisions based on actual data that is produced/manipulated by Airflow and
> provide actionable insights about the data as it moves through workflows,
> ultimately improving data reliability and data quality.
>
> Key highlights of the proposal include:
>
>    - *Introducing Assets:* Redefining datasets as assets, allowing for more
>    comprehensive data management and better alignment with modern data
>    engineering practices.
>    - *Progressive Adoptability:* Ensuring that enhancements can be
>    integrated incrementally without disrupting existing workflows.
>    - *Handling Incremental Load Strategies:* Providing first-class support
>    for incremental processes to provide visibility on data freshness, set
>    the stage for targeted backfills, and ultimately improve data reliability
>
> For more details, please refer to the attached document. I am eager to hear
> your thoughts and feedback on this proposal, as well as any suggestions for
> improvement. We will follow up with a set of formal AIPs.
>
> Constance
> --
>
> Constance Martineau
>
> Senior Product Manager
>
> Email: consta...@astronomer.io
>
> Time zone: US Eastern (EST UTC-5 / EDT UTC-4)
>


Re: [DISCUSS] Number of queries to Airflow database in "DAG File Processing Stats"

2024-06-14 Thread Constance Martineau
I love the idea. If we were to store it in the DB, would we keep a history,
or only the latest stats from the most recent dag parsing loop? DAG parsing
by default is every 30s right?

On Fri, Jun 14, 2024 at 6:53 AM Jarek Potiuk  wrote:

> > I think we still need to enable the ability for DAGs at parse time to
> access Variables.
>
> It's actually DISCOURAGED ;) to access Variables at parse time - though we
> have an experimental feature to make it efficient and we discussed whether
> to treat it as "good practice" and there was strong opposition to that 
> So I am not actually sure what will be our position on it.
>
> On Fri, Jun 14, 2024 at 12:50 PM Ash Berlin-Taylor  wrote:
>
> >
> >
> > > On 14 Jun 2024, at 10:22, Jarek Potiuk  wrote:
> > >
> > > > I think in the future of Airflow 3 where
> > > we will have task isolation, having `0` for all the DAGs will be a
> > > prerequisite for switching to "task isolation" mode and this could be
> > > actually verified in a migration tool.
> >
> > I think we still need to enable the ability for DAGs at parse time to
> > access Variables.
> >
> > Or at least I am not proposing we remove that ability. (I wouldn’t be
> > against it, but I was planning on continuing to support that for now)
> >
> > -ash
>


Re: [DISCUSS] Proposal around the Injection of Task Execution Secrets

2024-06-14 Thread Ash Berlin-Taylor
I’ve written up a lot more of the implementation details into an AIP 
https://cwiki.apache.org/confluence/x/xgmTEg

It’s still marked as Draft/Work In Progress for now, as there are a few
details we know we need to cover before the doc is complete.

(There was also some discussion in the dev call about a different name for this 
AIP)

> On 7 Jun 2024, at 19:25, Ash Berlin-Taylor  wrote:
> 
>> IMHO - if we do not want to support DB access at all from workers,
> triggerers and DAG file processors, we should replace the current "DB"
> bound interface with a new one specifically designed for this
> bi-directional direct communication Executor <-> Workers, 
> 
> That is exactly what I was thinking too (both that no DB should be the only 
> option in v3, and that we need a bidirectional purpose designed interface) 
> and am working up the details.
> 
> One of the key features of this will be giving each task try a "strong 
> identity" that the API server can use to identify and trust the requests, 
> likely some form of signed JWT.
> 
> I just need to finish off some other work before I can move over to focus
> on Airflow fully
> 
> -a
> 
> On 7 June 2024 18:01:56 BST, Jarek Potiuk  wrote:
>> I added some comments here and I think there is one big thing that should
>> be clarified when we get to "task isolation" - mainly its dependence on
>> AIP-44.
>> 
>> The Internal gRPC API (AIP-44) was designed the way it was only to allow
>> the same codebase to be used with or without the DB. It's based on
>> the assumption that a limited set of changes will be needed (that was
>> underestimated) in order to support both DB and GRPC ways of communication
>> between workers/triggerers/DAG file processors at the same time. That was a
>> basic assumption for AIP-44 - that we will want to keep both ways and
>> maximum backwards compatibility (including "pull" model of worker getting
>> connections, variables, and updating task state in the Airflow DB). We are
>> still using "DB" as a way to communicate between those components and this
>> does not change with AIP-44.
>> 
>> But for Airflow 3 the whole context is changed. If we go with the
>> assumption that Airflow 3 will only have isolated tasks and no DB "option",
>> I personally think using AIP-44 for that is a mistake. AIP-44 is merely a
>> wrapper over existing DB calls designed to be kept updated together with
>> the DB code, and the whole synchronisation of state, heartbeats, variables
>> and connection access still uses the same "DB communication" model and
>> there is basically no way we can get it more scalable this way. We will
>> still have the same limitations on the DB - where a number of DB
>> connections will be replaced with a number of gRPC connections. Essentially,
>> more scalability and performance were never the goal of AIP-44; all
>> the assumptions are that it only brings isolation but nothing more will
>> change. So I think it does not address some of the fundamental problems
>> stated in this "isolation" document.
>> 
>> Essentially AIP-44 merely exposes a small-ish number of methods (bigger
>> than initially anticipated) but it only wraps around the existing DB
>> mechanism. Essentially from the performance and scalability point of view -
>> we do not get much more than currently when using pgbouncer. This one
>> essentially turns a big number of connections coming from workers into a
>> smaller number of pooled connections that pgbouncer manages internally and
>> multiplexes the calls over. With the difference that unlike AIP-44 Internal
>> API server, pgbouncer does not limit the operations you can do from the
>> worker/triggerer/dag file processor - that's the main difference between
>> using pgbouncer and using our own Internal-API server.
>> 
>> IMHO - if we do not want to support DB access at all from workers,
>> triggerers and DAG file processors, we should replace the current "DB"
>> bound interface with a new one specifically designed for this
>> bi-directional direct communication Executor <-> Workers, more in line with
>> what Jens described in AIP-69 (and for example WebSocket and asynchronous
>> communication comes immediately to my mind if I did not have to use DB for
>> that communication). This is also why I put AIP-67 on hold, because IF
>> we go in the direction of having a "new" interface between worker, triggerer,
>> and DAG file processor, it might be way easier (and safer) to introduce
>> multi-team in Airflow 3 rather than 2 (or we can implement it differently
>> in Airflow 2 and differently in Airflow 3).
>> 
>> 
>> 
>> On Tue, Jun 4, 2024 at 3:58 PM Vikram Koka 
>> wrote:
>> 
>>> Fellow Airflowers,
>>> 
>>> I am following up on some of the proposed changes in the Airflow 3 proposal
>>> <
>>> https://docs.google.com/document/d/1MTr53101EISZaYidCUKcR6mRKshXGzW6DZFXGzetG3E/
>>> >,
>>> where more information was requested by the community, specifically around
>>> the injection of Task Execution Secrets. This topic

Re: [DISCUSS] Number of queries to Airflow database in "DAG File Processing Stats"

2024-06-14 Thread Jarek Potiuk
> I think we still need to enable the ability for DAGs at parse time to
access Variables.

It's actually DISCOURAGED ;) to access Variables at parse time - though we
have an experimental feature to make it efficient and we discussed whether
to treat it as "good practice" and there was strong opposition to that 
So I am not actually sure what will be our position on it.
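
To illustrate why parse-time access is discouraged, here is a small sketch that needs no Airflow installation. `FakeVariable` is a hypothetical stand-in for a database-backed Variable store that simply counts lookups; the two `parse_*` functions stand for parsing the same DAG file written in two styles.

```python
class FakeVariable:
    """Hypothetical stand-in for a database-backed Variable store."""

    lookups = 0
    _store = {"target_table": "events"}

    @classmethod
    def get(cls, key):
        cls.lookups += 1
        return cls._store[key]


def parse_with_top_level_access():
    # The lookup runs while the file is parsed, so every parse hits the store.
    table = FakeVariable.get("target_table")
    return lambda: f"load into {table}"


def parse_with_deferred_access():
    # The lookup is deferred into the task callable, so parsing costs nothing.
    return lambda: f"load into {FakeVariable.get('target_table')}"


# Simulate five parsing loops of each style.
for _ in range(5):
    parse_with_top_level_access()
top_level_lookups = FakeVariable.lookups

FakeVariable.lookups = 0
for _ in range(5):
    task = parse_with_deferred_access()
deferred_lookups_at_parse = FakeVariable.lookups

# The deferred lookup only happens when the task actually runs.
result = task()
deferred_lookups_after_run = FakeVariable.lookups
print(top_level_lookups, deferred_lookups_at_parse, deferred_lookups_after_run)
```

In a real DAG the deferred form is usually written as a Jinja template such as `{{ var.value.target_table }}` in a templated field, which is only rendered when the task runs.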

On Fri, Jun 14, 2024 at 12:50 PM Ash Berlin-Taylor  wrote:

>
>
> > On 14 Jun 2024, at 10:22, Jarek Potiuk  wrote:
> >
> > I think in the future of Airflow 3 where
> > we will have task isolation, having `0` for all the DAGs will be a
> > prerequisite for switching to "task isolation" mode and this could be
> > actually verified in a migration tool.
>
> I think we still need to enable the ability for DAGs at parse time to
> access Variables.
>
> Or at least I am not proposing we remove that ability. (I wouldn’t be
> against it, but I was planning on continuing to support that for now)
>
> -ash


Re: [DISCUSS] Number of queries to Airflow database in "DAG File Processing Stats"

2024-06-14 Thread Ash Berlin-Taylor


> On 14 Jun 2024, at 10:22, Jarek Potiuk  wrote:
> 
> I think in the future of Airflow 3 where
> we will have task isolation, having `0` for all the DAGs will be a
> prerequisite for switching to "task isolation" mode and this could be
> actually verified in a migration tool.

I think we still need to enable the ability for DAGs at parse time to access 
Variables.

Or at least I am not proposing we remove that ability. (I wouldn’t be against 
it, but I was planning on continuing to support that for now)

-ash

Re: [DISCUSS] Number of queries to Airflow database in "DAG File Processing Stats"

2024-06-14 Thread Ash Berlin-Taylor
We have a `DAG Code` model in the database which, from memory (I haven’t
checked just now), holds the entire source of the file, even if there are
multiple DAGs defined in one file, so we could add the columns to that row.
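
A rough sketch of what that could look like, using an in-memory SQLite table so it runs on its own. The table layout is heavily simplified and the `last_num_queries` column is hypothetical; this is not the actual Airflow schema. Each parse overwrites the previous value, so only the most recent run is kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE dag_code (
        fileloc TEXT PRIMARY KEY,
        source_code TEXT NOT NULL,
        last_num_queries INTEGER  -- hypothetical new column
    )
    """
)


def record_parse(conn, fileloc, source_code, num_queries):
    """Insert or overwrite the row for one DAG file with its latest stats."""
    conn.execute(
        """
        INSERT INTO dag_code (fileloc, source_code, last_num_queries)
        VALUES (?, ?, ?)
        ON CONFLICT(fileloc) DO UPDATE SET
            source_code = excluded.source_code,
            last_num_queries = excluded.last_num_queries
        """,
        (fileloc, source_code, num_queries),
    )


# Two consecutive parses of the same file: the second overwrites the first.
record_parse(conn, "dags/etl.py", "# two DAGs defined here", 7)
record_parse(conn, "dags/etl.py", "# two DAGs defined here", 2)

rows = conn.execute("SELECT fileloc, last_num_queries FROM dag_code").fetchall()
print(rows)
```

Because the row is keyed by file rather than by DAG, the count stays a per-file number even when one file defines several DAGs.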

-ash

> On 14 Jun 2024, at 11:37, Jarek Potiuk  wrote:
> 
> Yep. Per DAG file is what I actually meant :)
> 
> On Fri, Jun 14, 2024 at 12:26 PM Eugen Kosteev  wrote:
> 
>> The thing is that it is the "last count per DAG file".
>> I do not think we can actually calculate this per DAG. We could split the
>> total number of queries by the number of DAGs in the file, but this may be
>> confusing.
>> 
>> On Fri, Jun 14, 2024 at 12:24 PM Jarek Potiuk  wrote:
>> 
>>>> the cardinality of those logs is too high.
>>> 
>>> I was thinking about only showing "last count per DAG" - then cardinality
>>> would be "good enough" I think. It could also be exposed via metrics now
>>> that I think of it - no real need to see it in UI or API.
>>> 
>>> On Fri, Jun 14, 2024 at 12:14 PM Kaxil Naik  wrote:
>>> 
>>>> Yeah, valuable to show it in logs. For showing it in a web server or
>>>> storing it in DB, the cardinality of those logs is too high.
>>>>
>>>> On Fri, 14 Jun 2024 at 11:09, Eugen Kosteev wrote:
>>>>
>>>>> Yeah, I also think it is a good idea to expose it in the Airflow UI.
>>>>>
>>>>> Although, atm we do not have an entity such as DAG file (and this
>>>>> metric is per DAG file) in Airflow database, so we would need to
>>>>> design it a little bit.
>>>>> And attaching to the DAG model is not correct.
>>>>>
>>>>> But I totally agree, it would be good to have it in Airflow UI as well
>>>>> for "operation users" to have access to this information.
>>>>>
>>>>> On Fri, Jun 14, 2024 at 11:22 AM Jarek Potiuk wrote:
>>>>>
>>>>>> Good idea, it would also be good if we could have access to the
>>>>>> information exposed in the UI - so that "operations users" can see it
>>>>>> and maybe even act on it + API/ CLI to check it. I think in the future
>>>>>> of Airflow 3 where we will have task isolation, having `0` for all the
>>>>>> DAGs will be a prerequisite for switching to "task isolation" mode and
>>>>>> this could be actually verified in a migration tool.
>>>>>>
>>>>>> On Fri, Jun 14, 2024 at 10:59 AM Eugen Kosteev wrote:
>>>>>>
>>>>>>> Hi.
>>>>>>>
>>>>>>> I would like to discuss the proposal of adding a new column to the
>>>>>>> "DAG File Processing Stats" of DAG processor logs.
>>>>>>>
>>>>>>> Currently in the logs of DAG processor, there is following data
>>>>>>> (screenshot below) that includes # of DAGs, runtime, etc. per DAG
>>>>>>> file.
>>>>>>> [image: image.png]
>>>>>>>
>>>>>>> It seems that it would be beneficial to have also there data about
>>>>>>> the number of queries performed to the Airflow database during
>>>>>>> parsing of each file.
>>>>>>> It may be convenient to have it in case of debugging issues related
>>>>>>> to high load on Airflow database, e.g. typical scenario is when DAG
>>>>>>> file(s) have a lot of queries to database done on the top level of
>>>>>>> code and those are executed each time during parsing of these DAG
>>>>>>> files.
>>>>>>> One common example is excessive usage of "Variable.get" as top-level
>>>>>>> statements in DAG files.
>>>>>>>
>>>>>>> Having information about "number of queries to Airflow database" per
>>>>>>> DAG file may help a lot during debugging issues related to high load
>>>>>>> on database or issues related to long parsing of the DAG files.
>>>>>>>
>>>>>>> One caveat is that due to e.g. caching enabled for Variables or
>>>>>>> because of other reasons (dynamic DAGs), number of queries may be
>>>>>>> very different for each parsing of the DAG file,
>>>>>>> but at least we can have it as "Last Run Number of Queries" - that
>>>>>>> would already give some idea and engineer can also review logs
>>>>>>> historically to see its data in the past.
>>>>>>>
>>>>>>> What are your thoughts?
>>>>>>>
>>>>>>> --
>>>>>>> Eugene
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Eugene
>>>>>
>>>>
>>> 
>> 
>> --
>> Eugene
>> 


-
To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
For additional commands, e-mail: dev-h...@airflow.apache.org



Re: [DISCUSS] Number of queries to Airflow database in "DAG File Processing Stats"

2024-06-14 Thread Jarek Potiuk
Yep. Per DAG file is what I actually meant :)

On Fri, Jun 14, 2024 at 12:26 PM Eugen Kosteev  wrote:

> The thing is that it is the "last count per DAG file".
> I do not think we can actually calculate this per DAG. We could split the
> total number of queries by the number of DAGs in the file, but this may be
> confusing.
>
> On Fri, Jun 14, 2024 at 12:24 PM Jarek Potiuk  wrote:
>
>> >  the cardinality of those logs is too high.
>>
>> I was thinking about only showing "last count per DAG" - then cardinality
>> would be "good enough" I think. It could also be exposed via metrics now
>> that I think of it - no real need to see it in UI or API.
>>
>> On Fri, Jun 14, 2024 at 12:14 PM Kaxil Naik  wrote:
>>
>>> Yeah, valuable to show it in logs. For showing it in a web server or
>>> storing it in DB, the cardinality of those logs is too high.
>>>
>>> On Fri, 14 Jun 2024 at 11:09, Eugen Kosteev  wrote:
>>>
>>> > Yeah, I also think it is a good idea to expose it in the Airflow UI.
>>> >
>>> > Although, atm we do not have an entity such as DAG file (and this
>>> metric is
>>> > per DAG file) in Airflow database, so we would need to design it a
>>> little
>>> > bit.
>>> > And attaching to the DAG model is not correct.
>>> >
>>> > But I totally agree, it would be good to have it in Airflow UI as well
>>> for
>>> > "operation users" to have access to this information.
>>> >
>>> > On Fri, Jun 14, 2024 at 11:22 AM Jarek Potiuk 
>>> wrote:
>>> >
>>> > > Good idea, it would also be good if we could have access to the
>>> > information
>>> > > exposed in the UI - so that "operations users" can see it and maybe
>>> even
>>> > > act on it + API/ CLI to check it. I think in the future of Airflow 3
>>> > where
>>> > > we will have task isolation, having `0` for all the DAGs will be a
>>> > > prerequisite for switching to "task isolation" mode and this could be
>>> > > actually verified in a migration tool.
>>> > >
>>> > > On Fri, Jun 14, 2024 at 10:59 AM Eugen Kosteev 
>>> > wrote:
>>> > >
>>> > > > Hi.
>>> > > >
>>> > > > I would like to discuss the proposal of adding a new column to the
>>> "DAG
>>> > > > File Processing Stats" of DAG processor logs.
>>> > > >
>>> > > > Currently in the logs of DAG processor, there is following data
>>> > > > (screenshot below) that includes # of DAGs, runtime, etc. per DAG
>>> file.
>>> > > > [image: image.png]
>>> > > >
>>> > > > It seems that it would be beneficial to have also there data about
>>> the
>>> > > > number of queries performed to the Airflow database during parsing
>>> of
>>> > > each
>>> > > > file.
>>> > > > It maybe convenient to have it in case of debugging issues related
>>> to
>>> > > high
>>> > > > load on Airflow database, e.g. typical scenario is when DAG file(s)
>>> > have
>>> > > > a lot of queries to database done on the top level of code and
>>> those
>>> > are
>>> > > > executed each time during parsing of these DAG files.
>>> > > > One common example is excessive usage of "Variable.get" as
>>> > > > top-level statements in DAG files.
>>> > > >
>>> > > > Having information about "number of queries to Airflow database"
>>> per
>>> > DAG
>>> > > > file may help a lot during debugging issues related to high load on
>>> > > > database or issues related to long parsing of the DAG files.
>>> > > >
>>> > > > One caveat is that due to e.g. caching enabled for Variables or
>>> because
>>> > > of
>>> > > > other reasons (dynamic DAGs), number of queries may be very
>>> different
>>> > for
>>> > > > each parsing of the DAG file,
>>> > > > but at least we can have it as "Last Run Number of Queries" - that
>>> > would
>>> > > > already give some idea and engineer can also review logs
>>> historically
>>> > to
>>> > > > see its data in the past.
>>> > > >
>>> > > > What are your thoughts?
>>> > > >
>>> > > > --
>>> > > > Eugene
>>> > > >
>>> > >
>>> >
>>> >
>>> > --
>>> > Eugene
>>> >
>>>
>>
>
> --
> Eugene
>


Re: [DISCUSS] Number of queries to Airflow database in "DAG File Processing Stats"

2024-06-14 Thread Eugen Kosteev
The thing is that it is the "last count per DAG file".
I do not think we can actually calculate this per DAG. We could split the
total number of queries by the number of DAGs in the file, but this may be
confusing.

On Fri, Jun 14, 2024 at 12:24 PM Jarek Potiuk  wrote:

> >  the cardinality of those logs is too high.
>
> I was thinking about only showing "last count per DAG" - then cardinality
> would be "good enough" I think. It could also be exposed via metrics now
> that I think of it - no real need to see it in UI or API.
>
> On Fri, Jun 14, 2024 at 12:14 PM Kaxil Naik  wrote:
>
>> Yeah, valuable to show it in logs. For showing it in a web server or
>> storing it in DB, the cardinality of those logs is too high.
>>
>> On Fri, 14 Jun 2024 at 11:09, Eugen Kosteev  wrote:
>>
>> > Yeah, I also think it is a good idea to expose it in the Airflow UI.
>> >
>> > Although, atm we do not have an entity such as DAG file (and this
>> metric is
>> > per DAG file) in Airflow database, so we would need to design it a
>> little
>> > bit.
>> > And attaching to the DAG model is not correct.
>> >
>> > But I totally agree, it would be good to have it in Airflow UI as well
>> for
>> > "operation users" to have access to this information.
>> >
>> > On Fri, Jun 14, 2024 at 11:22 AM Jarek Potiuk  wrote:
>> >
>> > > Good idea, it would also be good if we could have access to the
>> > information
>> > > exposed in the UI - so that "operations users" can see it and maybe
>> even
>> > > act on it + API/ CLI to check it. I think in the future of Airflow 3
>> > where
>> > > we will have task isolation, having `0` for all the DAGs will be a
>> > > prerequisite for switching to "task isolation" mode and this could be
>> > > actually verified in a migration tool.
>> > >
>> > > On Fri, Jun 14, 2024 at 10:59 AM Eugen Kosteev 
>> > wrote:
>> > >
>> > > > Hi.
>> > > >
>> > > > I would like to discuss the proposal of adding a new column to the
>> "DAG
>> > > > File Processing Stats" of DAG processor logs.
>> > > >
>> > > > Currently in the logs of DAG processor, there is following data
>> > > > (screenshot below) that includes # of DAGs, runtime, etc. per DAG
>> file.
>> > > > [image: image.png]
>> > > >
>> > > > It seems that it would be beneficial to have also there data about
>> the
>> > > > number of queries performed to the Airflow database during parsing
>> of
>> > > each
>> > > > file.
>> > > > It maybe convenient to have it in case of debugging issues related
>> to
>> > > high
>> > > > load on Airflow database, e.g. typical scenario is when DAG file(s)
>> > have
>> > > > a lot of queries to database done on the top level of code and those
>> > are
>> > > > executed each time during parsing of these DAG files.
>> > > > One common example is excessive usage of "Variables.get" as
>> top-level
>> > > > statements in DAG files.
>> > > >
>> > > > Having information about "number of queries to Airflow database" per
>> > DAG
>> > > > file may help a lot during debugging issues related to high load on
>> > > > database or issues related to long parsing of the DAG files.
>> > > >
>> > > > One caveat is that due to e.g. caching enabled for Variables or
>> because
>> > > of
>> > > > other reasons (dynamic DAGs), number of queries may be very
>> different
>> > for
>> > > > each parsing of the DAG file,
>> > > > but at least we can have it as "Last Run Number of Queries" - that
>> > would
>> > > > already give some idea and engineer can also review logs
>> historically
>> > to
>> > > > see its data in the past.
>> > > >
>> > > > What are your thoughts?
>> > > >
>> > > > --
>> > > > Eugene
>> > > >
>> > >
>> >
>> >
>> > --
>> > Eugene
>> >
>>
>

-- 
Eugene


Re: [DISCUSS] Number of queries to Airflow database in "DAG File Processing Stats"

2024-06-14 Thread Jarek Potiuk
>  the cardinality of those logs is too high.

I was thinking about only showing "last count per DAG" - then cardinality
would be "good enough" I think. It could also be exposed via metrics now
that I think of it - no real need to see it in UI or API.
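
A rough sketch of why "last count only" keeps cardinality bounded, assuming
the value is keyed by DAG file (the count is collected per file). The class
`LastRunQueryGauge` and the file paths are hypothetical; in Airflow this would
presumably go through the existing metrics support rather than a custom class.

```python
from dataclasses import dataclass, field


@dataclass
class LastRunQueryGauge:
    """Hypothetical in-memory gauge: one value per DAG file, overwritten on each parse."""

    values: dict = field(default_factory=dict)

    def record(self, file_path: str, query_count: int) -> None:
        # Overwrite instead of append, so storage is bounded by the number of files,
        # not by the number of parsing loops.
        self.values[file_path] = query_count


gauge = LastRunQueryGauge()

# Three parsing loops over the same two (made-up) DAG files.
for etl_count, report_count in [(5, 0), (3, 0), (4, 1)]:
    gauge.record("dags/etl.py", etl_count)
    gauge.record("dags/report.py", report_count)

print(gauge.values)
```

However many parsing loops run, the gauge holds one entry per file, which is
the "good enough" cardinality mentioned above.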

On Fri, Jun 14, 2024 at 12:14 PM Kaxil Naik  wrote:

> Yeah, valuable to show it in logs. For showing it in a web server or
> storing it in DB, the cardinality of those logs is too high.
>
> On Fri, 14 Jun 2024 at 11:09, Eugen Kosteev  wrote:
>
> > Yeah, I also think it is a good idea to expose it in the Airflow UI.
> >
> > Although, atm we do not have an entity such as DAG file (and this metric
> is
> > per DAG file) in Airflow database, so we would need to design it a little
> > bit.
> > And attaching to the DAG model is not correct.
> >
> > But I totally agree, it would be good to have it in Airflow UI as well
> for
> > "operation users" to have access to this information.
> >
> > On Fri, Jun 14, 2024 at 11:22 AM Jarek Potiuk  wrote:
> >
> > > Good idea, it would also be good if we could have access to the
> > information
> > > exposed in the UI - so that "operations users" can see it and maybe
> even
> > > act on it + API/ CLI to check it. I think in the future of Airflow 3
> > where
> > > we will have task isolation, having `0` for all the DAGs will be a
> > > prerequisite for switching to "task isolation" mode and this could be
> > > actually verified in a migration tool.
> > >
> > > On Fri, Jun 14, 2024 at 10:59 AM Eugen Kosteev 
> > wrote:
> > >
> > > > Hi.
> > > >
> > > > I would like to discuss the proposal of adding a new column to the
> "DAG
> > > > File Processing Stats" of DAG processor logs.
> > > >
> > > > Currently in the logs of DAG processor, there is following data
> > > > (screenshot below) that includes # of DAGs, runtime, etc. per DAG
> file.
> > > > [image: image.png]
> > > >
> > > > It seems that it would be beneficial to have also there data about
> the
> > > > number of queries performed to the Airflow database during parsing of
> > > each
> > > > file.
> > > > It maybe convenient to have it in case of debugging issues related to
> > > high
> > > > load on Airflow database, e.g. typical scenario is when DAG file(s)
> > have
> > > > a lot of queries to database done on the top level of code and those
> > are
> > > > executed each time during parsing of these DAG files.
> > > > One common example is excessive usage of "Variables.get" as top-level
> > > > statements in DAG files.
> > > >
> > > > Having information about "number of queries to Airflow database" per
> > DAG
> > > > file may help a lot during debugging issues related to high load on
> > > > database or issues related to long parsing of the DAG files.
> > > >
> > > > One caveat is that due to e.g. caching enabled for Variables or
> because
> > > of
> > > > other reasons (dynamic DAGs), number of queries may be very different
> > for
> > > > each parsing of the DAG file,
> > > > but at least we can have it as "Last Run Number of Queries" - that
> > would
> > > > already give some idea and engineer can also review logs historically
> > to
> > > > see its data in the past.
> > > >
> > > > What are your thoughts?
> > > >
> > > > --
> > > > Eugene
> > > >
> > >
> >
> >
> > --
> > Eugene
> >
>


Re: [DISCUSS] Number of queries to Airflow database in "DAG File Processing Stats"

2024-06-14 Thread Kaxil Naik
Yeah, valuable to show it in logs. For showing it in a web server or
storing it in DB, the cardinality of those logs is too high.

On Fri, 14 Jun 2024 at 11:09, Eugen Kosteev  wrote:

> Yeah, I also think it is a good idea to expose it in the Airflow UI.
>
> Although, atm we do not have an entity such as DAG file (and this metric is
> per DAG file) in Airflow database, so we would need to design it a little
> bit.
> And attaching to the DAG model is not correct.
>
> But I totally agree, it would be good to have it in Airflow UI as well for
> "operation users" to have access to this information.
>
> On Fri, Jun 14, 2024 at 11:22 AM Jarek Potiuk  wrote:
>
> > Good idea, it would also be good if we could have access to the
> information
> > exposed in the UI - so that "operations users" can see it and maybe even
> > act on it + API/ CLI to check it. I think in the future of Airflow 3
> where
> > we will have task isolation, having `0` for all the DAGs will be a
> > prerequisite for switching to "task isolation" mode and this could be
> > actually verified in a migration tool.
> >
> > On Fri, Jun 14, 2024 at 10:59 AM Eugen Kosteev 
> wrote:
> >
> > > Hi.
> > >
> > > I would like to discuss the proposal of adding a new column to the "DAG
> > > File Processing Stats" of DAG processor logs.
> > >
> > > Currently in the logs of DAG processor, there is following data
> > > (screenshot below) that includes # of DAGs, runtime, etc. per DAG file.
> > > [image: image.png]
> > >
> > > It seems that it would be beneficial to have also there data about the
> > > number of queries performed to the Airflow database during parsing of
> > each
> > > file.
> > > It maybe convenient to have it in case of debugging issues related to
> > high
> > > load on Airflow database, e.g. typical scenario is when DAG file(s)
> have
> > > a lot of queries to database done on the top level of code and those
> are
> > > executed each time during parsing of these DAG files.
> > > One common example is excessive usage of "Variables.get" as top-level
> > > statements in DAG files.
> > >
> > > Having information about "number of queries to Airflow database" per
> DAG
> > > file may help a lot during debugging issues related to high load on
> > > database or issues related to long parsing of the DAG files.
> > >
> > > One caveat is that due to e.g. caching enabled for Variables or because
> > of
> > > other reasons (dynamic DAGs), number of queries may be very different
> for
> > > each parsing of the DAG file,
> > > but at least we can have it as "Last Run Number of Queries" - that
> would
> > > already give some idea and engineer can also review logs historically
> to
> > > see its data in the past.
> > >
> > > What are your thoughts?
> > >
> > > --
> > > Eugene
> > >
> >
>
>
> --
> Eugene
>


Re: [DISCUSS] Number of queries to Airflow database in "DAG File Processing Stats"

2024-06-14 Thread Eugen Kosteev
Yeah, I also think it is a good idea to expose it in the Airflow UI.

However, at the moment we do not have an entity such as a DAG file in the
Airflow database (and this metric is per DAG file), so we would need to
design it a little bit.
Attaching it to the DAG model would not be correct, since one file can
define several DAGs.

But I totally agree, it would be good to have it in Airflow UI as well for
"operation users" to have access to this information.

On Fri, Jun 14, 2024 at 11:22 AM Jarek Potiuk  wrote:

> Good idea, it would also be good if we could have access to the information
> exposed in the UI - so that "operations users" can see it and maybe even
> act on it + API/ CLI to check it. I think in the future of Airflow 3 where
> we will have task isolation, having `0` for all the DAGs will be a
> prerequisite for switching to "task isolation" mode and this could be
> actually verified in a migration tool.
>
> On Fri, Jun 14, 2024 at 10:59 AM Eugen Kosteev  wrote:
>
> > Hi.
> >
> > I would like to discuss the proposal of adding a new column to the "DAG
> > File Processing Stats" of DAG processor logs.
> >
> > Currently in the logs of DAG processor, there is following data
> > (screenshot below) that includes # of DAGs, runtime, etc. per DAG file.
> > [image: image.png]
> >
> > It seems that it would be beneficial to have also there data about the
> > number of queries performed to the Airflow database during parsing of
> each
> > file.
> > It maybe convenient to have it in case of debugging issues related to
> high
> > load on Airflow database, e.g. typical scenario is when DAG file(s) have
> > a lot of queries to database done on the top level of code and those are
> > executed each time during parsing of these DAG files.
> > One common example is excessive usage of "Variables.get" as top-level
> > statements in DAG files.
> >
> > Having information about "number of queries to Airflow database" per DAG
> > file may help a lot during debugging issues related to high load on
> > database or issues related to long parsing of the DAG files.
> >
> > One caveat is that due to e.g. caching enabled for Variables or because
> of
> > other reasons (dynamic DAGs), number of queries may be very different for
> > each parsing of the DAG file,
> > but at least we can have it as "Last Run Number of Queries" - that would
> > already give some idea and engineer can also review logs historically to
> > see its data in the past.
> >
> > What are your thoughts?
> >
> > --
> > Eugene
> >
>


-- 
Eugene


Re: [DISCUSS] Number of queries to Airflow database in "DAG File Processing Stats"

2024-06-14 Thread Jarek Potiuk
Good idea. It would also be good to have this information exposed in the
UI, so that "operations users" can see it and maybe even act on it, plus an
API/CLI to check it. I think in the future of Airflow 3, where we will have
task isolation, having `0` for all the DAGs will be a prerequisite for
switching to "task isolation" mode, and this could actually be verified in
a migration tool.
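
A minimal sketch of what such a migration check could look like, assuming the
last-run query count per DAG file is available as a plain mapping. The function
name `files_blocking_task_isolation` and the sample data are made up for
illustration; no such tool exists in this thread.

```python
def files_blocking_task_isolation(last_run_queries):
    """Return the DAG files whose most recent parse still issued database queries."""
    return sorted(path for path, count in last_run_queries.items() if count > 0)


# Made-up sample data: DAG file path -> query count from the most recent parse.
sample = {"dags/etl.py": 4, "dags/report.py": 0, "dags/ml.py": 0}

blocking = files_blocking_task_isolation(sample)
ready = not blocking  # ready only when every file reports 0 queries
print(f"ready for task isolation: {ready}; blocking files: {blocking}")
```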

On Fri, Jun 14, 2024 at 10:59 AM Eugen Kosteev  wrote:

> Hi.
>
> I would like to discuss the proposal of adding a new column to the "DAG
> File Processing Stats" of DAG processor logs.
>
> Currently in the logs of DAG processor, there is following data
> (screenshot below) that includes # of DAGs, runtime, etc. per DAG file.
> [image: image.png]
>
> It seems that it would be beneficial to have also there data about the
> number of queries performed to the Airflow database during parsing of each
> file.
> It maybe convenient to have it in case of debugging issues related to high
> load on Airflow database, e.g. typical scenario is when DAG file(s) have
> a lot of queries to database done on the top level of code and those are
> executed each time during parsing of these DAG files.
> One common example is excessive usage of "Variables.get" as top-level
> statements in DAG files.
>
> Having information about "number of queries to Airflow database" per DAG
> file may help a lot during debugging issues related to high load on
> database or issues related to long parsing of the DAG files.
>
> One caveat is that due to e.g. caching enabled for Variables or because of
> other reasons (dynamic DAGs), number of queries may be very different for
> each parsing of the DAG file,
> but at least we can have it as "Last Run Number of Queries" - that would
> already give some idea and engineer can also review logs historically to
> see its data in the past.
>
> What are your thoughts?
>
> --
> Eugene
>


[DISCUSS] Number of queries to Airflow database in "DAG File Processing Stats"

2024-06-14 Thread Eugen Kosteev
Hi.

I would like to discuss the proposal of adding a new column to the "DAG
File Processing Stats" of DAG processor logs.

Currently, the DAG processor logs contain the following data (screenshot
below), including the number of DAGs, runtime, etc. per DAG file.
[image: image.png]

It seems that it would be beneficial to also have data there about the
number of queries made to the Airflow database during parsing of each
file.
It may be convenient to have it when debugging issues related to high
load on the Airflow database. A typical scenario is when DAG files make
a lot of database queries in top-level code, and those queries are
executed every time these DAG files are parsed.
One common example is excessive use of "Variable.get" as top-level
statements in DAG files.

Having information about "number of queries to Airflow database" per DAG
file may help a lot during debugging issues related to high load on
database or issues related to long parsing of the DAG files.
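
A minimal sketch of how such a per-file count could be collected. It assumes
nothing about the real DAG processor internals: the standard-library `sqlite3`
module stands in for the Airflow database, and `Connection.set_trace_callback`
counts executed statements. Airflow talks to its database through SQLAlchemy,
where a comparable hook would be the engine-level `before_cursor_execute`
event. The names `count_queries`, `get_variable` and `parse_dag_file` are
hypothetical.

```python
import sqlite3
from contextlib import contextmanager

# Stand-in for the Airflow metadata database (assumption: in-memory SQLite).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE variable (key TEXT PRIMARY KEY, val TEXT)")
conn.execute("INSERT INTO variable VALUES ('env', 'prod'), ('owner', 'data-team')")
conn.commit()


@contextmanager
def count_queries(connection):
    """Count SQL statements executed on `connection` inside the with-block."""
    stats = {"queries": 0}

    def on_statement(_sql):
        stats["queries"] += 1

    connection.set_trace_callback(on_statement)
    try:
        yield stats
    finally:
        # Always detach the callback, even if parsing raised.
        connection.set_trace_callback(None)


def get_variable(key):
    """Hypothetical analogue of a top-level Variable.get call: one SELECT per call."""
    row = conn.execute("SELECT val FROM variable WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None


def parse_dag_file():
    """Hypothetical 'DAG file' whose top-level code hits the database on every parse."""
    env = get_variable("env")
    owner = get_variable("owner")
    return env, owner


with count_queries(conn) as stats:
    parse_dag_file()

last_run_queries = stats["queries"]
print(f"Last Run Number of Queries: {last_run_queries}")
```

Each top-level lookup shows up as one counted statement, so a file with many
such calls would stand out in the stats table.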

One caveat is that, due to e.g. caching enabled for Variables or other
reasons (dynamic DAGs), the number of queries may differ considerably
between parses of the same DAG file.
But at least we can have it as "Last Run Number of Queries". That would
already give some idea, and an engineer can also review the logs to
see the values from past runs.

What are your thoughts?

-- 
Eugene


Re: Google Provider Package System Tests Dashboard

2024-06-14 Thread Michał Modras
Great work Freddy!

On Thu, Jun 13, 2024 at 4:08 AM Wei Lee  wrote:

> This is great!
>
> Best,
> Wei
>
> > On Jun 12, 2024, at 9:47 PM, Bishundeo, Rajeshwar
>  wrote:
> >
> > Fantastic job from the Google team. Love it!!
> >
> > -- Rajesh
> >
> > On 2024-06-12, 9:20 AM, "Pankaj Koti" <pankaj.k...@astronomer.io.INVALID> wrote:
> >
> > yes, indeed a great one!
> >
> >
> >
> >
> > Best regards,
> >
> >
> > *Pankaj Koti*
> > Senior Software Engineer (Airflow OSS Engineering team)
> > Location: Pune, Maharashtra, India
> > Timezone: Indian Standard Time (IST)
> >
> >
> >
> >
> > On Wed, Jun 12, 2024 at 6:42 PM Vincent Beck wrote:
> >
> >> Really nice!!
> >>
> >> On 2024/06/12 11:38:13 Jarek Potiuk wrote:
> >>>> I believe we should include a reference to it in the Google provider
> >>>> documentation.
> >>>
> >>> Already there: https://github.com/apache/airflow/pull/40102 in the README
> >>> documentation (which is where it should be as it is mostly
> >>> maintainer/contributor docs rather than user's docs).
> >>>
> >>> J.
> >>>
> >>>
> >>> On Wed, Jun 12, 2024 at 1:34 PM Pankaj Singh wrote:
> >>>
> >>>> Nice! Thanks for sharing. I believe we should include a reference to it
> >>>> in the Google provider documentation.
> >>>>
> >>>> On Wed, Jun 12, 2024 at 3:24 PM Freddy Demiane wrote:
> >>>>
> >>>>> Hi Team,
> >>>>>
> >>>>> At Google Cloud Composer, we developed a public dashboard that shows the
> >>>>> results of the System Tests of the head revision of the Google Provider
> >>>>> Package against the head revision of Apache Airflow. This will help
> >>>>> detect regressions caused by modifying an operator or a system test. At
> >>>>> the moment, the system tests are executed every 6 hours. Here is the
> >>>>> link for the dashboard:
> >>>>> https://storage.googleapis.com/providers-dashboard-html/dashboard.html
> >>>>> We hope this eases the development process!
> >>>>>
> >>>>> Best,
> >>>>> Freddy
> >>>>>
> >>>>
> >>>
> >>
>