I haven’t looked that much into the Airflow code yet, but the backend code must 
already be pluggable since it supports multiple databases.

What if we were to maintain a separate repo and artifact for just an extension 
for the SQL Server backend? This seems like it’d be low-impact on the Airflow 
community from a maintenance perspective, but keeps MSSQL users using the 
official Airflow artifacts. (Maybe just have the Airflow documentation link to 
the extension).

From: Kaxil Naik <kaxiln...@gmail.com>
Date: Thursday, June 13, 2024 at 9:53 AM
To: dev@airflow.apache.org <dev@airflow.apache.org>
Subject: Re: [DISCUSS] Restore the SQL server backend
The cost of maintaining it in Airflow repo (with CI/CD, GH issues etc) is,
unfortunately, just too high.

On Thu, 13 Jun 2024 at 17:11, James Duong <james.du...@improving.com.invalid>
wrote:

> Hi Jarek,
>
> Thanks for your response.
>
> I would prefer to make this work part of the main Airflow repository.
>
> My previous experience with maintaining forks of Apache projects has
> always been that it fragments the community unnecessarily. Users of a
> company-specific fork might ask for features (not just relating to MSSQL)
> that could benefit the community at large. The larger community can miss
> out on insights from the users and maintainers of the fork.
>
> There is a greater cost both getting updates from the main repo and
> upstreaming improvements to the main repo as the codebase diverges further.
>
> I could see confusion from end users if there are multiple sets of
> artifacts for different forks of Airflow about which one to get.
>
> What are your thoughts on this?
>
> From: Jarek Potiuk <ja...@potiuk.com>
> Date: Monday, June 3, 2024 at 9:32 PM
> To: dev@airflow.apache.org <dev@airflow.apache.org>
> Cc: james.du...@improving.com.invalid <james.du...@improving.com.invalid>
> Subject: Re: [DISCUSS] Restore the SQL server backend
> I am not sure if you read what I wrote with full understanding.
>
> To be perfectly honest - If you secure enough resources, I think *STILL* it
> will be better if you maintain your own fork and apply necessary changes
> and offer it commercially to anyone who needs it. This is way easier for
> the community, and better for you commercially  - and if you are **really**
> committed for a long term to do MSSQL, then you should have no problem in
> maintaining the fork.
>
> On Mon, Jun 3, 2024 at 11:15 PM James Duong
> <james.du...@improving.com.invalid> wrote:
>
> > Thanks for all of your feedback and discussion.
> >
> > The interest and usage from the enterprise MSSQL community is very large -
> > it's unfortunate that numbers are difficult to gather.
> >
> > In terms of the support - I hear you that it should not be limited to only
> > CI improvements and PR support and a more active role needs to be taken. I
> > am working on a plan that would provide the necessary involvement in the
> > community.
> >
> > Please allow me some time to see what is possible.
> >
> > From: Wei Lee <weilee...@gmail.com>
> > Date: Friday, May 31, 2024 at 8:45 AM
> > To: dev@airflow.apache.org <dev@airflow.apache.org>
> > Cc: james.du...@improving.com.invalid <james.du...@improving.com.invalid>
> > Subject: Re: [DISCUSS] Restore the SQL server backend
> > I agree with Jed and the following comments. If my memory serves me right,
> > this topic has been discussed a few times in the past. 5% doesn't seem very
> > convincing. Even if it's biased, I'm still not persuaded that there are a
> > large number of users that are worth the community's effort. And Jarek
> > pointed out a great solution for forking Airflow and adding MSSQL support
> > to it.
> >
> > Best,
> > Wei
> >
> > > On May 31, 2024, at 7:50 PM, Elad Kalif <elad...@apache.org> wrote:
> > >
> > > I agree with Jarek
> > >
> > > I am a bit worried about the mental model of this proposal as you are
> > > offering to deliver a feature but you are not offering to be a community
> > > member.
> > > I had a lot of frustration with the MsSQL backend tests, it really caused
> > > me pain as a contributor. According to your mental model - will you
> > > actively review community PRs, triage Airflow issues and offer guidance and
> > > help when needed about MsSQL or will the maintainers have to track these
> > > problems and actively tag you/your team for assistance?
> > >
> > > Let me give an example: User opens a GitHub issue about HA scheduler. Will
> > > your team participate in the issue triage? Or do you expect the community
> > > to triage the issue and, only after some discussion when it turns out that
> > > it's an MsSQL-specific issue, notify you?
> > >
> > > On Fri, May 31, 2024 at 10:05 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> > >
> > >>> We also understand and are ready to address the concerns stated in the
> > >>> vote about support and resolving CI issues
> > >>
> > >> Hello James,
> > >>
> > >> Could you please explain how exactly you are planning to help the
> > >> maintainers who are working on developing new features, to make sure
> > >> they know and realise the unobvious consequences that some of the DB
> > >> changes might have? Some of the features of MySQL are causing, for
> > >> example, heavy slowdown of inserts because of rebalancing B-trees on a
> > >> UUID index for databases that (unlike Postgres and MariaDB) lack native
> > >> UUID support. How would you help with discovering similar types of
> > >> issues? See here:
> > >> https://lists.apache.org/thread/7235o1bc3w4694sw8q9m4p58g3tdcjj7
> > >>
> > >> Could you please explain how many people, and how much effort and
> > >> dedicated resources (i.e. continuous testing of stability and
> > >> performance), you are going to spend on fixing those?
> > >>
> > >> IMHO. If you see a LOT of users that want MsSQL support - you are
> > >> absolutely free to spend that money, effort and resources on making a fork
> > >> of Airflow with MsSQL support and charge a premium for that (and a large
> > >> one). That seems like a very good business model to make if you see a lot
> > >> of interest there.
> > >>
> > >> This is all perfectly fine according to our licence and community would be
> > >> really thankful for someone who would take the burden of maintaining MSSQL
> > >> while also making it possible for MSSQL users. Maybe that's the way to go
> > >> for you?
> > >>
> > >> J,
> > >>
> > >>
> > >>
> > >> On Fri, May 31, 2024 at 8:32 AM James Duong
> > >> <james.du...@improving.com.invalid> wrote:
> > >>
> > >>> Many of the MSSQL customers using Airflow with MSSQL as the backend are
> > >>> unlikely to participate in those types of surveys, unfortunately, so I fear
> > >>> the numbers are biased. We have had direct feedback from multiple very
> > >>> large MSSQL customers who see the removal of this support as a large
> > >>> blocker to using Airflow.
> > >>>
> > >>> Although yes, Microsoft does support PostgreSQL (and MySQL), MSSQL is an
> > >>> extremely widely used and popular Database platform across different
> > >>> segments whether Enterprise, Government, Major or SMC. Various Oracle, IBM
> > >>> and OSS customers are diversifying their Database platform with SQL and it
> > >>> is important for Airflow-type products to support SQL.
> > >>>
> > >>> We also understand and are ready to address the concerns stated in the
> > >>> vote about support and resolving CI issues.
> > >>>
> > >>> From: Jarek Potiuk <ja...@potiuk.com>
> > >>> Date: Thursday, May 30, 2024 at 3:47 PM
> > >>> To: dev@airflow.apache.org <dev@airflow.apache.org>
> > >>> Subject: Re: [DISCUSS] Restore the SQL server backend
> > >>> Agree with all comments above. Also I think bringing MSSQL back is going to
> > >>> make it way more complex to implement some of the improvements we thought
> > >>> about - mostly async DB operations (only recently - November 2023 - async
> > >>> support has been added to MSSQL). We know from the history that MSSQL
> > >>> gave us a lot of headache while developing it and there is no reason to
> > >>> believe it will be different. And "helping in CI" is not going to cut it -
> > >>> we need every maintainer who wants to implement a new DB change to become
> > >>> expert on what is different in MSSQL.
> > >>>
> > >>> Honestly - if I'd lose 5% of users because their internal rules say
> > >>> MSSQL-only (and no Postgres, which as mentioned above is widely supported
> > >>> and popular including Azure) at the expense of better performance, less
> > >>> resource usage (as we expect with asyncio) delivered faster to remaining
> > >>> 95% users, then I know what my decision is.
> > >>>
> > >>> BTW. That's not really a criterion we use for such decisions about
> > >>> technology, but unlike Amazon and Google, Microsoft Azure Data Factory
> > >>> Airflow team is generally absent from any of those discussions we have
> > >>> here. Despite us reaching out in various ways they have never "shown up"
> > >>> here, never contributed anything (or at least we have no knowledge about
> > >>> it) - including contributions, improvements, system tests nor any other
> > >>> activities in the community. They are simply not giving back to the
> > >>> community.
> > >>>
> > >>> If they did and officially said (and had proven as the Amazon and Google
> > >>> team did multiple times for their integrations) that they are willing to
> > >>> support and maintain MSSQL DB, maybe we would reconsider - mostly because
> > >>> we could have counted on having them step in when needed (again - as it
> > >>> happened multiple times with Amazon and Google - when we reach out and need
> > >>> their help we know we can count on it). I don't see a particular reason why
> > >>> we should support their proprietary technology.
> > >>>
> > >>> J.
> > >>>
> > >>>
> > >>> On Fri, May 31, 2024 at 12:16 AM Damian Shaw <ds...@striketechnologies.com>
> > >>> wrote:
> > >>>
> > >>>> I would say that MSSQL was often marked as "experimental"
> > >>>> (https://airflow.apache.org/docs/apache-airflow/2.6.0/howto/set-up-database.html),
> > >>>> so IMO I don't think the evidence of it only being used by 5% is
> > >>>> particularly convincing that it wouldn't eventually be popular. Users who
> > >>>> might want to primarily use MSSQL because of internal corporate
> > >>>> restrictions might have a large overlap with users who have restrictions on
> > >>>> anything that says "experimental".
> > >>>>
> > >>>> I think the more important fact is it was a real burden on development,
> > >>>> and there was no MSSQL champion in the Airflow maintainers.
> > >>>>
> > >>>> -----Original Message-----
> > >>>> From: Andrey Anshin <andrey.ans...@taragol.is>
> > >>>> Sent: Thursday, May 30, 2024 2:39 PM
> > >>>> To: dev@airflow.apache.org
> > >>>> Cc: james.du...@improving.com.invalid
> > >>>> Subject: Re: [DISCUSS] Restore the SQL server backend
> > >>>>
> > >>>> There was a proposal to keep it in the past [1] with a short explanation
> > >>>> why the maintainers did not want to keep it.
> > >>>>
> > >>>>> many Microsoft customers who are using Airflow
> > >>>>
> > >>>> Microsoft also supports and participates in the development of PostgreSQL,
> > >>>> there is one Core Team member and couple of Major Contributors working in
> > >>>> Microsoft [2] and in addition a couple years ago Microsoft acquired one of
> > >>>> the PostgreSQL vendors [3]. So I would like to believe that Microsoft also
> > >>>> could offer different services around PostgreSQL for their customers.
> > >>>>
> > >>>>
> > >>>> [1] Keep Mssql support:
> > >>>> https://lists.apache.org/thread/ot58ms069z4pyhj786j1m0dqds6lhjks
> > >>>> [2] PostgreSQL: Contributors Profiles:
> > >>>> https://www.postgresql.org/community/contributors/
> > >>>> [3] Microsoft Acquires Citus Data:
> > >>>> https://www.citusdata.com/blog/2019/01/24/microsoft-acquires-citus-data/
> > >>>>
> > >>>> On Thu, 30 May 2024 at 21:18, Pierre Jeambrun <pierrejb...@gmail.com>
> > >>>> wrote:
> > >>>>
> > >>>>> I share Jed's feeling. The effort required to maintain those compared to
> > >>>>> the value it actually brings, combined with the usage from the survey,
> > >>>>> doesn’t seem worth it to me.
> > >>>>>
> > >>>>> On Thu 30 May 2024 at 19:16, Jed Cunningham <jedcunning...@apache.org>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> Just for context, here were (roughly) the results from the 2023
> > >>>>>> Airflow survey:
> > >>>>>>
> > >>>>>> PostgreSQL: 75%
> > >>>>>> MySQL: 15%
> > >>>>>> MSSQL: 5%
> > >>>>>>
> > >>>>>> Also, there are already discussions about potentially dropping MySQL
> > >>>>>> support in Airflow 3. Given all that and the points from the past
> > >>>>>> vote, I don't think it makes much sense to bring MSSQL back.
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>
> >
> >
> >
> > Warning: The sender of this message could not be validated and may not be
> > the actual sender.
> >
>