The cost of maintaining it in the Airflow repo (with CI/CD, GH issues, etc.) is, unfortunately, just too high.
On Thu, 13 Jun 2024 at 17:11, James Duong <james.du...@improving.com.invalid> wrote:

> Hi Jarek,
>
> Thanks for your response.
>
> I would prefer to make this work part of the main Airflow repository.
>
> My previous experience with maintaining forks of Apache projects has
> always been that it fragments the community unnecessarily. Users of a
> company-specific fork might ask for features (not just relating to MSSQL)
> that could benefit the community at large. The larger community can miss
> out on insights from the users and maintainers of the fork.
>
> There is a greater cost both getting updates from the main repo and
> upstreaming improvements to the main repo as the codebase diverges further.
>
> I could see confusion from end users if there are multiple sets of
> artifacts for different forks of Airflow about which one to get.
>
> What are your thoughts on this?
>
> From: Jarek Potiuk <ja...@potiuk.com>
> Date: Monday, June 3, 2024 at 9:32 PM
> To: dev@airflow.apache.org <dev@airflow.apache.org>
> Cc: james.du...@improving.com.invalid <james.du...@improving.com.invalid>
> Subject: Re: [DISCUSS] Restore the SQL server backend
> I am not sure if you read what I wrote with full understanding.
>
> To be perfectly honest - If you secure enough resources, I think *STILL* it
> will be better if you maintain your own fork and apply necessary changes
> and offer it commercially to anyone who needs it. This is way easier for
> the community, and better for you commercially - and if you are **really**
> committed for a long term to do MSSQL, then you should have no problem in
> maintaining the fork.
>
> On Mon, Jun 3, 2024 at 11:15 PM James Duong
> <james.du...@improving.com.invalid> wrote:
>
> > Thanks for all of your feedback and discussion.
> >
> > The interest and usage from the enterprise MSSQL community is very large -
> > it's unfortunate that numbers are difficult to gather.
> >
> > In terms of the support - I hear you that it should not be limited to only
> > CI improvements and PR support and a more active role needs to be taken. I
> > am working on a plan that would provide the necessary involvement in the
> > community.
> >
> > Please allow me some time to see what is possible.
> >
> > From: Wei Lee <weilee...@gmail.com>
> > Date: Friday, May 31, 2024 at 8:45 AM
> > To: dev@airflow.apache.org <dev@airflow.apache.org>
> > Cc: james.du...@improving.com.invalid <james.du...@improving.com.invalid>
> > Subject: Re: [DISCUSS] Restore the SQL server backend
> > I agree with Jed and the following comments. If my memory serves me right,
> > this topic has been discussed a few times in the past. 5% doesn't seem very
> > convincing. Even if it's biased, I'm still not persuaded that there are a
> > large number of users that are worth the community's effort. And Jarek
> > pointed out a great solution for forking Airflow and adding MSSQL support
> > to it.
> >
> > Best,
> > Wei
> >
> > > On May 31, 2024, at 7:50 PM, Elad Kalif <elad...@apache.org> wrote:
> > >
> > > I agree with Jarek
> > >
> > > I am a bit worried about the mental model of this proposal as you are
> > > offering to deliver a feature but you are not offering being a community
> > > member.
> > > I had a lot of frustration with the MsSQL backend tests, it really caused
> > > me pain as a contributor. According to your mental model - will you
> > > actively review community PRs, triage Airflow issues and offer guidance and
> > > help when needed about MsSQL or will the maintainers have to track these
> > > problems and actively tag you/your team for assistance?
> > >
> > > Let me give an example: User opens a Github issue about HA scheduler. Will
> > > your team participate in the issue triage?
> > > Or do you expect the community
> > > to triage the issue and only after some discussion when it turns out that
> > > it's MsSQL specific issue then we need to notify you?
> > >
> > > On Fri, May 31, 2024 at 10:05 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> > >
> > >>> We also understand and are ready to address the concerns stated in the
> > >>> vote about support and resolving CI issues
> > >>
> > >> Hello James,
> > >>
> > >> Could you please explain how exactly are you planning to help a number of
> > >> maintainers who are working on developing new feature to make sure
> > >> they know and realise unobvious consequences of some of the DB changes they
> > >> might have when some of the features of MYSQL are causing - for example
> > >> heavy slowdown of inserts because of rebalancing B-TREES on UUID index for
> > >> databases (that unlike Postgres and MariaDB) lack native UUID support (see
> > >> . How would you help with discovering similar type of issues see here
> > >> https://lists.apache.org/thread/7235o1bc3w4694sw8q9m4p58g3tdcjj7
> > >>
> > >> Could you please explain how many people, effort and dedicated resources
> > >> (i.e. continuous testing of stability and performance you are going to
> > >> spend on fixing those)?
> > >>
> > >> IMHO. If you see a LOT of users that want MsSQL support - you are
> > >> absolutely free to spend those money, effort and resources on making a fork
> > >> of Airflow with MsSQL support and charge a premium for that (and a large
> > >> one). That seems like a very good business model to make if you see a lot
> > >> of interest there.
> > >>
> > >> This is all perfectly fine according to our licence and community would be
> > >> really thankful for someone who would take the burden of maintaining MSSQL
> > >> while also making it possible for MSSQL users. Maybe that's the way to go
> > >> for you?
> > >>
> > >> J,
> > >>
> > >> On Fri, May 31, 2024 at 8:32 AM James Duong
> > >> <james.du...@improving.com.invalid> wrote:
> > >>
> > >>> Many of the MSSQL customers using Airflow with MSSQL as the backend are
> > >>> unlikely to participate in those types of surveys, unfortunately, so I fear
> > >>> the numbers are biased. We have had direct feedback from multiple very
> > >>> large MSSQL customers who see the removal of this support as a large
> > >>> blocker to using Airflow.
> > >>>
> > >>> Although yes, Microsoft does support PostgreSQL (and MySQL), MSSQL is an
> > >>> extremely widely used and popular Database platform across different
> > >>> segments whether Enterprise, Government, Major or SMC. Various Oracle, IBM
> > >>> and OSS customers are diversifying their Database platform with SQL and it
> > >>> is important for Airflow-type products to support SQL.
> > >>>
> > >>> We also understand and are ready to address the concerns stated in the
> > >>> vote about support and resolving CI issues.
> > >>>
> > >>> From: Jarek Potiuk <ja...@potiuk.com>
> > >>> Date: Thursday, May 30, 2024 at 3:47 PM
> > >>> To: dev@airflow.apache.org <dev@airflow.apache.org>
> > >>> Subject: Re: [DISCUSS] Restore the SQL server backend
> > >>> Agree with all comments above. Also I think bringing MSSQL back is going to
> > >>> make it way more complex to implement some of the improvements we thought
> > >>> about - mostly async DB operations (only recently - November 2023 - async
> > >>> support has been added to MSSQL) and we know from the history that MSSQL
> > >>> gave us a lot of headache while developing it and there is no reason to
> > >>> believe it will be different. And "helping in CI" is not going to cut it -
> > >>> we need every maintainer who wants to implement a new DB change to become
> > >>> expert on what is different in MSSQL.
> > >>>
> > >>> Honestly - if I'd lose 5% of users because their internal rules say
> > >>> MSSQL-only (and no Postgres, which as mentioned above is widely supported
> > >>> and popular including Azure) at the expense of better performance, less
> > >>> resource usage (as we expect with asyncio) delivered faster to remaining
> > >>> 95% users, then I know what my decision is.
> > >>>
> > >>> BTW. That's not really a criteria we use for such decisions about
> > >>> technology, but unlike Amazon and Google, Microsoft Azure Data Factory
> > >>> Airflow team is generally absent from any of those discussions we have
> > >>> here. Despite us reaching out in various ways they have never "Shown" here,
> > >>> never contributed anything (or at least we have no knowledge about it) -
> > >>> including contributions, improvements, system tests nor any other
> > >>> activities in the community. They are simply not giving back to the
> > >>> community.
> > >>>
> > >>> If they did and officially said (and had proven as the Amazon and Google
> > >>> team did multiple times for their integrations) that they are willing to
> > >>> support and maintain MSSQL DB, maybe we would reconsider - mostly because
> > >>> we could have counted on having them step in when needed (again - as it
> > >>> happened multiple times with Amazon and Google - when we reach out and need
> > >>> their help we know we can count on it). I don't see a particular reason why
> > >>> we should support their proprietary technology.
> > >>>
> > >>> J.
> > >>>
> > >>> On Fri, May 31, 2024 at 12:16 AM Damian Shaw
> > >>> <ds...@striketechnologies.com> wrote:
> > >>>
> > >>>> I would say that MSSQL was often marked as "experimental"
> > >>>> (https://airflow.apache.org/docs/apache-airflow/2.6.0/howto/set-up-database.html),
> > >>>> so IMO I don't think the evidence of it only being used by 5% is
> > >>>> particularly convincing that it wouldn't eventually be popular. Users who
> > >>>> might want to primarily use MSSQL because of internal corporate
> > >>>> restrictions might have a large overlap with users who have restrictions on
> > >>>> anything that says "experimental".
> > >>>>
> > >>>> I think the more important fact is it was a real burden on development,
> > >>>> and there was no MSSQL champion in the Airflow maintainers.
> > >>>>
> > >>>> -----Original Message-----
> > >>>> From: Andrey Anshin <andrey.ans...@taragol.is>
> > >>>> Sent: Thursday, May 30, 2024 2:39 PM
> > >>>> To: dev@airflow.apache.org
> > >>>> Cc: james.du...@improving.com.invalid
> > >>>> Subject: Re: [DISCUSS] Restore the SQL server backend
> > >>>>
> > >>>> There was a proposal to keep it in the past [1] with a short explanation
> > >>>> why the maintainers did not want to keep it.
> > >>>>
> > >>>>> many Microsoft customers who are using Airflow
> > >>>>
> > >>>> Microsoft also supports and participates in the development of PostgreSQL,
> > >>>> there is one Core Team member and couple of Major Contributors working in
> > >>>> Microsoft [2] and in addition a couple years ago Microsoft acquired one of
> > >>>> the PostgreSQL vendors [3]. So I would like to believe that Microsoft also
> > >>>> could offer different services around PostgreSQL for their customers.
> > >>>>
> > >>>> [1] Keep Mssql support:
> > >>>> https://lists.apache.org/thread/ot58ms069z4pyhj786j1m0dqds6lhjks
> > >>>> [2] PostgreSQL: Contributors Profiles:
> > >>>> https://www.postgresql.org/community/contributors/
> > >>>> [3] Microsoft Acquires Citus Data:
> > >>>> https://www.citusdata.com/blog/2019/01/24/microsoft-acquires-citus-data/
> > >>>>
> > >>>> On Thu, 30 May 2024 at 21:18, Pierre Jeambrun <pierrejb...@gmail.com>
> > >>>> wrote:
> > >>>>
> > >>>>> I share Jed's feeling. The effort required to maintain those compared to
> > >>>>> the value it actually brings combined with the usage from the survey,
> > >>>>> it doesn’t seem worth it to me.
> > >>>>>
> > >>>>> On Thu 30 May 2024 at 19:16, Jed Cunningham <jedcunning...@apache.org>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> Just for context, here were (roughly) the results from the 2023
> > >>>>>> Airflow survey:
> > >>>>>>
> > >>>>>> PostgreSQL: 75%
> > >>>>>> MySQL: 15%
> > >>>>>> MSSQL: 5%
> > >>>>>>
> > >>>>>> Also, there are already discussions about potentially dropping MySQL
> > >>>>>> support in Airflow 3. Given all that and the points from the past
> > >>>>>> vote, I don't think it makes much sense to bring MSSQL back.