Re: [Discussion] Deprecate auto cleanup RenderedTaskInstanceFields and decouple k8s_pod_yaml

2023-01-30 Thread Jarek Potiuk
COMMENT: While writing the answer here, I think I found a deeper problem (and an optimisation that is needed), i.e. I think the delete should be even more fine-grained than it is today and include map_index. Please take a look at the end (maybe TP might also comment on that one). > 1. Additional indexes a

Re: [VOTE] PR of the Month

2023-01-30 Thread Ping Zhang
Hi John, I vote for [28300] Add Public Interface description to Airflow documentation Thanks, Ping On Mon, Jan 30, 2023 at 11:17 AM John Thomas wrote: > It's a new year and a new newsletter! This vote is a bit late because I > had to do some work on the script, but as usual it closes at 9a M

Re: [DISCUSSION] Move K8S and Celery Executors (and related) to respective providers?

2023-01-30 Thread Mehta, Shubham
Big +1 to this proposal. Thanks for initiating it, Jarek. This will accelerate innovation for the K8s and Celery executors and encourage other providers, like Google and Amazon, to include cloud-native executors in their packages. A couple of questions/concerns: 1. How does this impact work of releas

Re: Request for feedback on proposal for new OpenLineage provider in Airflow

2023-01-30 Thread Julien Le Dem
Thank you very much for your input Jarek. I am responding in the comments and adding to the doc accordingly. I would also love to hear from more stakeholders. Thanks to all who provided feedback so far. Julien On Fri, Jan 27, 2023 at 12:57 AM Jarek Potiuk wrote: > General comment from my side: I

Re: [Discussion] Deprecate auto cleanup RenderedTaskInstanceFields and decouple k8s_pod_yaml

2023-01-30 Thread Andrey Anshin
There are some potential drawbacks in your suggestion (you never know until you try). 1. Additional indexes add performance degradation on insert, but bring potential improvements on delete and an unknown effect on update; the RDBMS still has to rebalance the index and keep it consistent with the table. 2.

Re: [Discussion] Deprecate auto cleanup RenderedTaskInstanceFields and decouple k8s_pod_yaml

2023-01-30 Thread Kaxil Naik
> I think we should not deprecate it though, but find a more efficient way of deleting the old keys. I think we could slightly denormalize RenderedTaskInstance + DagRun tables, and add DAG_RUN_EXECUTION_DATE to the RenderedTaskInstance table and that will be enough to optimise it. Yeah, I agree with

Re: [VOTE] PR of the Month

2023-01-30 Thread Kaxil Naik
+1 on 27063 too On Mon, 30 Jan 2023 at 20:13, Jarek Potiuk wrote: > I vote for #27063 too. > > On Mon, Jan 30, 2023 at 8:28 PM Pierre Jeambrun > wrote: > > > > Hello, > > > > I vote for #27063 :) > > > > Le lun. 30 janv. 2023 à 20:17, John Thomas

Re: Thoughts on Adding Weaviate Provider?

2023-01-30 Thread Jarek Potiuk
Cool. Glad I could help. On Sun, Jan 29, 2023 at 10:12 PM Marcus Eagan wrote: > > J, > > Not discouraging at all. I reached out in search of clarity that I now have. > > Thanks for the detailed response. I think all of this makes a ton of sense. > Upon first sight of how the provider ecosystem h

Re: [VOTE] PR of the Month

2023-01-30 Thread Jarek Potiuk
I vote for #27063 too. On Mon, Jan 30, 2023 at 8:28 PM Pierre Jeambrun wrote: > > Hello, > > I vote for #27063 :) > > Le lun. 30 janv. 2023 à 20:17, John Thomas > a écrit : >> >> It's a new year and a new newsletter! This vote is a bit late because I had >> to do some work on the script, but a

Re: [DISCUSSION] Move K8S and Celery Executors (and related) to respective providers?

2023-01-30 Thread Jarek Potiuk
> Should we also create apache-airflow-providers-dask for DaskExecutor? Absolutely. On Mon, Jan 30, 2023 at 8:14 PM Andrey Anshin wrote: > > Should we also create apache-airflow-providers-dask for DaskExecutor? > > > Best Wishes > Andrey Anshin > > > > On Sun, 29 Jan 2023 at 13:21, Jarek Po

Re: [Discussion] Deprecate auto cleanup RenderedTaskInstanceFields and decouple k8s_pod_yaml

2023-01-30 Thread Jarek Potiuk
I think there is a good reason to clean those up automatically. Rendered task instance fields are almost arbitrary in size. If we try to keep all historical values there by default, there are numerous cases where it will grow far, far too quickly. And I am not worried at all about locks on t

Re: [Discussion] Deprecate auto cleanup RenderedTaskInstanceFields and decouple k8s_pod_yaml

2023-01-30 Thread Andrey Anshin
I guess two things have reduced the performance of this query over time: Dynamic Task Mapping and the use of run_id instead of execution date. I still personally think that changing the default value from 30 to 0 might improve the performance of multiple concurrent tasks, just because this query does no

Re: [VOTE] PR of the Month

2023-01-30 Thread Pierre Jeambrun
Hello, I vote for #27063 :) Le lun. 30 janv. 2023 à 20:17, John Thomas a écrit : > It's a new year and a new newsletter! This vote is a bit late because I > had to do some work on the script, but as usual it closes at 9a MST on > Wednesday, Feb 1. > > The script has highlighted the following, b

[VOTE] PR of the Month

2023-01-30 Thread John Thomas
It's a new year and a new newsletter! This vote is a bit late because I had to do some work on the script, but as usual it closes at 9a MST on Wednesday, Feb 1. The script has highlighted the following, but feel free to vote for / nominate any you like. Since we didn't run a vote in December, anyt

Re: [DISCUSSION] Move K8S and Celery Executors (and related) to respective providers?

2023-01-30 Thread Andrey Anshin
Should we also create apache-airflow-providers-dask for DaskExecutor? Best Wishes *Andrey Anshin* On Sun, 29 Jan 2023 at 13:21, Jarek Potiuk wrote: > Hello Everyone, > > As a follow-up to AIP-51 - when it is completed (with few more quirks > like the one described by Andrey in the "Rende

Re: [DISCUSSION] Move K8S and Celery Executors (and related) to respective providers?

2023-01-30 Thread Ping Zhang
Hi Jarek, Thanks for bringing this up. I think it is a great idea to modularize the executors into providers, especially after AIP-51. Thanks, Ping On Sun, Jan 29, 2023 at 1:21 AM Jarek Potiuk wrote: > Hello Everyone, > > As a follow-up to AIP-51 - when it is completed (with few more quirks

Re: [DISCUSSION] Move K8S and Celery Executors (and related) to respective providers?

2023-01-30 Thread Ferruzzi, Dennis
Yeah, alright. In that case I agree. I just wanted to make sure we weren't making it harder to onboard a new user. From: Jarek Potiuk Sent: Monday, January 30, 2023 10:18 AM To: dev@airflow.apache.org Subject: RE: [EXTERNAL][DISCUSSION] Move K8S and Celery Exe

Re: [DISCUSSION] Move K8S and Celery Executors (and related) to respective providers?

2023-01-30 Thread Jarek Potiuk
> Would this make it harder for a new user to find which executors are > available and get started? On one hand, for better or worse it makes the > executor choice more of a conscious decision. On the other hand unless we're > still setting one as the default and packaging it together - and yo

Re: [DISCUSSION] Move K8S and Celery Executors (and related) to respective providers?

2023-01-30 Thread Ferruzzi, Dennis
Would this make it harder for a new user to find which executors are available and get started? On one hand, for better or worse it makes the executor choice more of a conscious decision. On the other hand unless we're still setting one as the default and packaging it together - and you did sa