Yep. If we can make both Postgres and MySQL work with async, I am also all
for the "all" approach. If it means that we need to support only certain
drivers and certain versions of the DBs, so be it. As mentioned in my
original comments (a long time ago, when we still had MSSQL support), this
was not really possible back then. But now that we have dropped MSSQL, and
provided we have the right drivers for MySQL, it should be possible, I guess.
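
To illustrate the "they give it a DSN" point from Ash's message quoted below, here is a rough sketch of how a user-supplied DSN could be pointed at an async driver. The helper name `to_async_dsn` and the driver choices (asyncpg, aiomysql) are only assumptions for illustration; the thread does not settle on specific drivers.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed mapping from backend name to an async-capable SQLAlchemy
# dialect+driver pair. The driver choices are illustrative only.
ASYNC_DRIVERS = {
    "postgresql": "postgresql+asyncpg",
    "postgres": "postgresql+asyncpg",
    "mysql": "mysql+aiomysql",
}


def to_async_dsn(dsn: str) -> str:
    """Swap the DSN scheme for an async driver (hypothetical helper)."""
    parts = urlsplit(dsn)
    # "postgresql+psycopg2" -> "postgresql": drop any explicit sync driver.
    backend = parts.scheme.split("+", 1)[0]
    try:
        async_scheme = ASYNC_DRIVERS[backend]
    except KeyError:
        raise ValueError(f"no async driver configured for {backend!r}") from None
    # Credentials, host, port and database name are left untouched.
    return urlunsplit(parts._replace(scheme=async_scheme))
```

With SQLAlchemy installed, the returned string is the kind of URL one would hand to `create_async_engine` from `sqlalchemy.ext.asyncio`; an unsupported backend fails loudly instead of silently falling back to sync.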

On Mon, Apr 8, 2024 at 8:17 PM Daniel Standish
<daniel.stand...@astronomer.io.invalid> wrote:

> I wholeheartedly agree with Ash that it should be all or nothing.  And
> *all* sounds better to me :)
>
>
>
> On Mon, Apr 8, 2024 at 10:54 AM Ash Berlin-Taylor <a...@apache.org> wrote:
>
> > I’m all in favour of async SQLAlchemy. We’ve built two products at
> > Astronomer that exclusively use sqlalchemy+psycopg3+async and love it.
> > Async does have a bit of a learning curve, but SQLA has done it nicely
> > and it works really well.
> >
> > I think this needs to be an all-or-nothing thing: having to maintain
> > sync and async versions of functions/features is a non-starter in my
> > mind; it’d just be a worryingly large amount of duplicated work. Given
> > that the only DBs we support now are Postgres and MySQL, I can’t think
> > of any reason users should even care: they give it a DSN and that’s the
> > end of their involvement.
> >
> > Amogh: I don’t understand what you mean by point 3 below.
> >
> > -ash
> >
> > > On 8 Apr 2024, at 05:31, Amogh Desai <amoghdesai....@gmail.com> wrote:
> > >
> > > I checked the content and the PR that you attached.
> > >
> > > The results do seem promising and I like the general idea of this
> > > approach. But as Jarek also mentioned on the PR:
> > >
> > > 1. Not everyone might be on board with going all async, due to
> > > limitations around access to the drivers or corporate limitations.
> > > So we definitely need a way to opt out for those who aren't
> > > interested.
> > >
> > > 2. We should have a seamless fallback to sync if async doesn't work
> > > for whatever reason.
> > >
> > > 3. Are we going all in, or are we limiting the scope to, let's say,
> > > connections + variables and expanding based on the results in the
> > > long term?
> > >
> > > Looking forward to improvements async can bring in!
> > >
> > > Thanks & Regards,
> > > Amogh Desai
> > >
> > >
> > > On Sun, Apr 7, 2024 at 3:13 AM Hussein Awala <huss...@awala.fr> wrote:
> > >
> > >> The metadata database is the brain of Airflow: all scheduling
> > >> decisions, cross-communication, synchronization between components,
> > >> and management via the web server go through it.
> > >>
> > >> One option to optimize the DB queries is to merge several into a
> > >> single query to reduce latency and overall time, but this is not
> > >> always possible because the queries are sometimes completely
> > >> independent, and merging them is impossible or too complicated. In
> > >> that case we have another option, which is to run them concurrently,
> > >> since they are independent. The only way to do this currently is
> > >> multithreading (the sync_to_async decorator creates a thread and
> > >> waits for it using an asyncio coroutine), which is already a good
> > >> start. But by using the asyncio extension for sqlalchemy we will be
> > >> able to create thousands of lightweight coroutines with the same
> > >> amount of resources as a few threads, which will also help to reduce
> > >> resource consumption.
> > >>
> > >> A few months ago I started a PoC to add support for this extension
> > >> and implement an asynchronous version of connections and variables,
> > >> so that triggers can get/set them without blocking the event loop and
> > >> hurting the performance of the triggerer, and the result was
> > >> impressive (https://github.com/apache/airflow/pull/36504).
> > >>
> > >> I see a good opportunity to improve the performance of our REST API
> > >> and web server (for example
> > >> https://github.com/apache/airflow/issues/38776), knowing that we can
> > >> mix sync and async endpoints, which will help for a smooth migration.
> > >>
> > >> I also think that it will be possible (and very useful) to migrate
> > >> some of our executors (kubernetes and celery) to a fully asynchronous
> > >> version to improve their performance.
> > >>
> > >> I use the sqlalchemy asyncio extension in many personal and company
> > >> projects, and I'm very happy with it, but I would like to hear from
> > >> others if they have any positive or negative feedback about it.
> > >>
> > >> I will create a new AIP for integrating the asyncio extension of
> > >> sqlalchemy, and follow-up AIPs to migrate/support each component once
> > >> the first one is implemented, but first I would like to check what
> > >> the community and other committers think about this integration.
> > >>
> >
> >
> >
> >
>
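
For readers skimming the thread, the concurrency argument in Hussein's original message (independent queries awaited concurrently as coroutines instead of one after another) can be sketched roughly as follows. This uses only the standard library: `fake_query` is a hypothetical stand-in that sleeps rather than talking to a database, and the query names and latencies are made up for illustration.

```python
import asyncio
import time


async def fake_query(name: str, latency: float) -> str:
    """Hypothetical stand-in for an awaited DB call; it only sleeps."""
    await asyncio.sleep(latency)
    return f"{name}: done"


async def run_sequentially(queries):
    # Each query waits for the previous one to finish.
    return [await fake_query(name, latency) for name, latency in queries]


async def run_concurrently(queries):
    # All queries are in flight at once on a single event loop;
    # gather() returns results in the order the awaitables were passed.
    return await asyncio.gather(
        *(fake_query(name, latency) for name, latency in queries)
    )


def timed(coro_fn, queries):
    """Run coro_fn(queries) to completion and return (results, seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_fn(queries))
    return results, time.perf_counter() - start


# Made-up independent "queries" with a simulated 50 ms round trip each.
QUERIES = [("connections", 0.05), ("variables", 0.05), ("pools", 0.05)]
```

Calling `timed(run_sequentially, QUERIES)` and `timed(run_concurrently, QUERIES)` should return the same results, with the concurrent run taking roughly the latency of one query instead of the sum. Real gains depend on the driver and the database, which this sketch does not model.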
