Re: [DISCUSS] indexes for API calls

2024-05-30 Thread Jarek Potiuk
Also if we are speaking about indexes - a bit tangential but I know we were planning to replace some of the primary keys (mainly because of MySQL limitations) with synthetic keys for the DAG versioning case (where we planned to use UUIDs). We should be very, very careful when doing it because I've lea

Re: [DISCUSS] Restore the SQL server backend

2024-05-30 Thread James Duong
Many of the MSSQL customers using Airflow with MSSQL as the backend are unlikely to participate in those types of surveys, unfortunately, so I fear the numbers are biased. We have had direct feedback from multiple very large MSSQL customers who see the removal of this support as a large blocker

Re: [DISCUSS] Restore the SQL server backend

2024-05-30 Thread Ephraim Anierobi
I also agree with others and aside from the survey, MSSQL was a headache. I think so many pain points would delay Airflow 3 development if we reconsider MSSQL. Maybe any reconsideration should be after Airflow 3? On Thu, 30 May 2024 at 23:48, Jarek Potiuk wrote: > Agree with all comments above.

Re: [DISCUSS] indexes for API calls

2024-05-30 Thread Pankaj Koti
Addressing one of Pierre's questions: Should I index foreign keys? Is that done by default or should I explicitly do it? The answer varies depending on the database engine. PostgreSQL and SQLite do not add indexes on foreign keys by default, while MySQL does. Developers should keep this in mind.
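
A minimal sketch of the SQLite half of that point, using Python's standard `sqlite3` module. The `dag_run` and `task_instance` tables and the `idx_ti_dag_run_id` index name are simplified, hypothetical stand-ins, not Airflow's actual schema. It only shows that declaring a foreign key creates no index in SQLite, so one has to be added explicitly; PostgreSQL and MySQL are not exercised here.

```python
import sqlite3

# Hypothetical, simplified tables: not Airflow's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE dag_run (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE task_instance ("
    " id INTEGER PRIMARY KEY,"
    " dag_run_id INTEGER NOT NULL REFERENCES dag_run (id))"
)


def index_names(table):
    # PRAGMA index_list returns one row per index; column 1 is the index name.
    return [row[1] for row in conn.execute(f"PRAGMA index_list('{table}')")]


# The foreign key alone does not create an index on dag_run_id.
before = index_names("task_instance")

# The index has to be created explicitly.
conn.execute("CREATE INDEX idx_ti_dag_run_id ON task_instance (dag_run_id)")
after = index_names("task_instance")

print(before, after)
```

On MySQL (InnoDB) the explicit `CREATE INDEX` step would normally be redundant, since the engine creates an index for the foreign key column itself.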

Re: [DISCUSS] indexes for API calls

2024-05-30 Thread Daniel Standish
I would be in favor of this for sure. Let's see what others think :) On Thu, May 30, 2024 at 10:55 PM Jarek Potiuk wrote: > Simply speaking - let's make "lack of optimisation for these and that" part > of the API specification. > > On Fri, May 31, 2024 at 7:54 AM Jarek Potiuk wrote: > > > So l

Re: [DISCUSS] indexes for API calls

2024-05-30 Thread Jarek Potiuk
Simply speaking - let's make "lack of optimisation for these and that" part of the API specification. On Fri, May 31, 2024 at 7:54 AM Jarek Potiuk wrote: > So let's document as part of the API which queries are not performant and > suggest users that want to use them to make their analytics quer

Re: [DISCUSS] indexes for API calls

2024-05-30 Thread Jarek Potiuk
So let's document as part of the API which queries are not performant and suggest that users who want to use them run their analytics queries elsewhere. I'd very much prefer that it's slow "by design" for everyone rather than add an option for the user to speed them up where we decided not to do it ou

Re: [DISCUSS] indexes for API calls

2024-05-30 Thread Daniel Standish
So I think the notion that *all possibly expensive queries* should have an index to support them is not a tenable one. E.g. there are something like 5 params on TI list endpoint that don't have an index. In contrast with queries from airflow itself, the API queries are more arbitrary -- user can
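
As a rough illustration of what an unindexed filter costs, here is a small sketch with Python's standard `sqlite3` module. The `task_instance` table, its `pool` column, and the `idx_ti_pool` index are hypothetical simplifications, not the real Airflow schema or the actual TI list endpoint query. It compares the query plan for the same filter before and after an index exists.

```python
import sqlite3

# Hypothetical, simplified table: not Airflow's real schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_instance ("
    " id INTEGER PRIMARY KEY, state TEXT, pool TEXT)"
)


def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable detail string.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql)
    return " | ".join(row[-1] for row in rows)


query = "SELECT * FROM task_instance WHERE pool = 'default_pool'"

# Without an index on pool, SQLite has to scan the whole table.
plan_without_index = plan(query)

# With an index, the same filter becomes an index search.
conn.execute("CREATE INDEX idx_ti_pool ON task_instance (pool)")
plan_with_index = plan(query)

print(plan_without_index)
print(plan_with_index)
```

Every additional index like this one also has to be maintained on each write, which is part of why indexing every filterable column is not free.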

Re: [VOTE] Airflow Providers prepared on May 30, 2024

2024-05-30 Thread Wei Lee
+1 (non-binding) Tested my changes and our example DAGs without encountering issues. Best, Wei > On May 30, 2024, at 9:28 PM, Elad Kalif wrote: > > Hey all, > > I have just cut the new wave Airflow Providers packages. This email is > calling a vote on the release, > which will last for 72 hou

Re: [DISCUSS] indexes for API calls

2024-05-30 Thread Jarek Potiuk
The API is public, it **should** behave well regardless of local customizations. We have automated DB maintenance and we "promise" to our users it will work and we explicitly tell them "do not touch airflow DB as you might break things". Now, if we change the narrative and tell them "if you w

Re: [DISCUSS] Restore the SQL server backend

2024-05-30 Thread Jarek Potiuk
Agree with all comments above. Also I think bringing MSSQL back is going to make it way more complex to implement some of the improvements we thought about - mostly async DB operations (only recently, in November 2023, async support has been added to MSSQL and we know from the history that MSSQL gave

RE: [DISCUSS] Restore the SQL server backend

2024-05-30 Thread Damian Shaw
I would say that MSSQL was often marked as "experimental" (https://airflow.apache.org/docs/apache-airflow/2.6.0/howto/set-up-database.html), so IMO I don't think the evidence of it only being used by 5% is particularly convincing that it wouldn't eventually be popular. Users who might want to p

Re: [DISCUSS] Restore the SQL server backend

2024-05-30 Thread Andrey Anshin
There was a proposal to keep it in the past [1] with a short explanation why the maintainers did not want to keep it. > many Microsoft customers who are using Airflow Microsoft also supports and participates in the development of PostgreSQL, there is one Core Team member and a couple of Major Cont

Re: [DISCUSS] Restore the SQL server backend

2024-05-30 Thread Pierre Jeambrun
I share Jed's feeling. Given the effort required to maintain it compared to the value it actually brings, combined with the usage from the survey, it doesn’t seem worth it to me. On Thu 30 May 2024 at 19:16, Jed Cunningham wrote: > Just for context, here were (roughly) the results from the 2023 Airflow

Re: [DISCUSS] Restore the SQL server backend

2024-05-30 Thread Jed Cunningham
Just for context, here were (roughly) the results from the 2023 Airflow survey: PostgreSQL: 75%, MySQL: 15%, MSSQL: 5%. Also, there are already discussions about potentially dropping MySQL support in Airflow 3. Given all that and the points from the past vote, I don't think it makes much sense to br

[DISCUSS] Restore the SQL server backend

2024-05-30 Thread James Duong
Hi Airflow community! Support for SQL Server has been removed starting with Airflow version 2.9.0 and this is a concern for many Microsoft customers who are using Airflow. Many customers have a multi-cloud

[ANNOUNCE] Podcast Launch- The Data Flowcast: Mastering Airflow for Data Engineering & AI

2024-05-30 Thread Briana Okyere
Hey All, Very excited to announce the relaunch of the Airflow Podcast, now titled "The Data Flowcast: Mastering Airflow for Data Engineering & AI." This podcast is specially designed for the Apache Airflow community and aims to share invaluable insights, u

[VOTE] Airflow Providers prepared on May 30, 2024

2024-05-30 Thread Elad Kalif
Hey all, I have just cut the new wave Airflow Providers packages. This email is calling a vote on the release, which will last for 72 hours - which means that it will end on June 02, 2024 13:25 UTC and until 3 binding +1 votes have been received. Consider this my (binding) +1. Airflow Provide

Re: [DISCUSS] indexes for API calls

2024-05-30 Thread Pierre Jeambrun
Thank you for starting this discussion. At first I would say that databases should be indexed to achieve good performance against standard queries / use cases. The REST API does not do any crazy things/querying (as I recall). Listing, filtering, ordering and searching against our main tables sho

Re: [VOTE] May 2024 PR of the Month

2024-05-30 Thread Hussein Awala
+1 for #39336 On Thu, May 30, 2024 at 8:47 AM Pierre Jeambrun wrote: > Well, the task try_number was an issue for as long as I can remember and we > had a few tries attempting to fix it in the past, it was a pain to work > with. > > I am really glad to see it in a more stable state, good job! >