I added you, Yulei,
I have no time to look at the details, but I have two big concerns about this: Audience and Security (first concern), and whether we want to do it at all (second concern).

First, about security and audience:

This is against the current Security Model of Airflow: https://airflow.apache.org/docs/apache-airflow/stable/security/security_model.html where the users of the Airflow UI are a completely distinct group of people from DAG authors. All our efforts (including the latest AIP-56 - https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-56+Extensible+user+management) assume that this is the case: DAG authors are not necessarily the same people (and are often a different user group) as those who use the Airflow UI to manage DAG execution and maintenance.

I personally think putting DAG authoring into the Airflow "webserver" is a no-go and I would be voting against it. IMHO the Airflow UI is for maintenance and monitoring of existing DAGs, NOT for DAG authoring. IMHO there is no particular reason why these two should be tightly integrated as a single "Airflow webserver" component.

I do not even know if we really want to maintain a single "UI driven" DAG authoring experience in the Airflow community. I think Airflow provides a very good "base" for anyone to develop and release their own visual DAG authoring solution - and many stakeholders and users of Airflow do so already. You can develop any solution that produces Python DAG files and stores them in the Airflow DAG folder - there is nothing to prevent anyone from doing it. Maybe it would be worthwhile to expose some APIs from Airflow to make it easier and more "native" (for example, an API to let the Airflow Scheduler/DAG file processor know to prioritise, or force, parsing of files placed in the DAG folder), but I do not see a particular reason (or actually I see a lot of security-driven reasons why not to do it) to have the Python files coming through the Airflow webserver (either generated by it or submitted to it).
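To make the "produce Python DAG files and store them in the DAG folder" idea concrete, here is a minimal sketch of what an external authoring tool could do, entirely outside of the webserver. The names `render_dag_source` and `write_dag_file`, and the "list of (task_id, bash_command) steps" input format, are hypothetical and only for illustration. The generated file assumes Airflow 2.4+ (the `schedule` argument) and uses `BashOperator`; a real tool would likely support far more than a linear chain of bash commands.

```python
import os
import tempfile
from pathlib import Path


def render_dag_source(dag_id, steps):
    """Render Python source for a linear DAG.

    `steps` is a list of (task_id, bash_command) tuples that run in order.
    This is an illustrative format, not an Airflow API.
    """
    if not steps:
        raise ValueError("at least one step is required")
    lines = [
        "from datetime import datetime",
        "",
        "from airflow import DAG",
        "from airflow.operators.bash import BashOperator",
        "",
        # repr() is used so that ids and commands are embedded as valid literals
        f"with DAG(dag_id={dag_id!r}, start_date=datetime(2023, 1, 1),",
        "         schedule=None, catchup=False) as dag:",
    ]
    for i, (task_id, command) in enumerate(steps):
        lines.append(
            f"    t{i} = BashOperator(task_id={task_id!r}, bash_command={command!r})"
        )
    if len(steps) > 1:
        # Chain the tasks sequentially: t0 >> t1 >> ...
        lines.append("    " + " >> ".join(f"t{i}" for i in range(len(steps))))
    return "\n".join(lines) + "\n"


def write_dag_file(dag_id, steps, dags_folder):
    """Write the generated DAG into the DAG folder and return its path.

    The source is written to a temporary ".tmp" file first and then renamed,
    so a complete ".py" file appears in the folder in one step.
    """
    folder = Path(dags_folder)
    target = folder / f"{dag_id}.py"
    source = render_dag_source(dag_id, steps)
    with tempfile.NamedTemporaryFile(
        "w", dir=folder, suffix=".tmp", delete=False
    ) as tmp:
        tmp.write(source)
    os.replace(tmp.name, target)
    return target
```

Such a tool only needs write access to wherever the DAG files come from (a folder here, but it could equally be a Git repository); it needs nothing from the webserver.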
In Airflow 2 we deliberately isolated the webserver from the DAGs folder for that particular reason - security - and we've done it not only for "write" access: we also removed the need for "read" access. This is how Airflow 2 works today: the Airflow webserver SHOULD NOT have the DAGs folder available to it AT ALL. Not even as an option. It just SHOULD NOT happen. The Airflow webserver should run in an isolated environment where the only thing it has access to is the Airflow DB. Full stop.

This proposal goes in a completely opposite direction (if I read it right) - it expects the Airflow webserver to have access to the DAGs folder (or Git, or whatever the "source" of the DAG folder is). For me, from the security point of view, this is a plain NO. This violates a number of assumptions and principles we adopted when we defined, published and agreed on our Security Model, it opens a floodgate of potential security issues that we simply do not have to deal with currently, and as such - I'd say it's never going to happen (this way).

Of course - it could be a separate component - not connected to the Airflow webserver, exposing its own, completely different interface. It could be somewhat connected to the Airflow UI - for example, I can easily imagine that DAGs displayed in the Airflow UI have links (provided via a plugin or API) that take the user of the Airflow UI to another server - deep-linking to the exact page where you can edit and submit DAGs. As long as it is completely outside of the Airflow UI, with different authentication, security controls etc. - this is a perfectly acceptable solution for me. However, this is a different component (I'd call it the "DAG authoring" one) and I am not at all sure if we want to implement and maintain such a component in the Airflow community.
This is a huge, new component to maintain, which requires a lot of UI expertise (it is much more interactive than the current Airflow UI), and I think that accepting it for the community to maintain is a huge obligation - I think currently we are not capable of doing it.

Second concern - do we want to have such a separate component at all?

Assuming that my security concern is addressed (this will be a very hard NO from me if it is not), the second concern is about whether we want such a separate DAG authoring component to be maintained and released by the Airflow community.

I believe that quite possibly we do not even WANT to do it. I kind of like the situation where Airflow exposes APIs that 3rd parties (like Pinterest) can plug into in order to release and manage their own solutions. And there might be more than that - there are already at least a few solutions out there that allow visual DAG authoring - some of them are open-source and available to everyone, some of them are available as a service accompanying managed installations of Airflow. And I see no particular reason why every single Airflow user out there should stick to a single "Airflow community blessed" solution.

Such solutions are necessarily opinionated - you cannot have a fully generic UI-driven solution that will match the flexibility of Python DAG authoring. This is simply impossible by definition - this is actually why Airflow's DAG authoring is Python and not YAML or any other declarative format - because of the flexibility and un-opinionated way in which you can build your DAG authoring setup. And I like how different solutions give different users different "opinions" on how they can author DAGs visually. I do not feel any particular need for the community to choose and bless a single solution to be the "chosen one", and even less so to take on the burden of developing and maintaining it.
So - before you spend your time on submitting the proposal - I would strongly advise discussing here whether this is something that we - as a community - want AT ALL.

If you ask me - the discussion should maybe be "how can we make Airflow better suited for anyone who would like to build such a solution?". Maybe we should figure out the set of APIs that are needed to make it easier? Maybe we should describe how this can be done in our documentation? Maybe it calls for a more prominent "UI DAG authoring solutions" section in the "Airflow Ecosystem" page https://airflow.apache.org/ecosystem/ where we could explicitly list solutions that allow for visual DAG authoring, and where companies like Pinterest could add links to solutions that they publish, release and maintain - to make them more easily discoverable by users who are looking for such solutions.

This is not a hard NO. If we have people in our community who show commitment and are willing and happy to manage and release such a component, and if we - as a community - want to manage it and agree that there is value in such a component, it will not be a hard NO from me. It will be "I think we do not need it, we have no capacity, and it will slow us down a lot if we do". But I will not veto it if there is an overwhelming majority of others who would vote for it and express interest in maintaining it.

So my proposal is to turn that discussion into "do we really want that, and what are the alternatives?".

Of course - it's only my personal opinion, and I guess others might have a different one here. But I am curious what others think on both accounts:

1) Security aspect
2) Do we want visual DAG authoring in the community at all?

J.
On Wed, Nov 15, 2023 at 8:46 PM Yulei Li <yule...@pinterest.com.invalid> wrote:

> Hi Ash,
>
> Missed your email, unfortunately my organization disabled the Google Doc
> option to share the Doc Link that is readable to everyone, so I tried to
> grant the read access to the "dev@airflow.apache.org" email alias but it
> did not work.
>
> I have granted commenter permissions to all the requests that I have
> received. But if it makes it easier, I can convert the Doc into an AIP
> <https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals>,
> just need to request edit access to it. And my Wiki ID is: *yuleili*.
> Thanks
>
> Best,
> Yulei
>
> On Fri, Nov 10, 2023 at 4:16 AM Ash Berlin-Taylor <a...@apache.org> wrote:
>
> > Please make the document readable by anyone with the link.
> >
> > Thanks,
> > Ash
> >
> > > On 9 Nov 2023, at 21:10, Yulei Li <yule...@pinterest.com.INVALID> wrote:
> > >
> > > Hi Airflow community,
> > >
> > > My name is Yulei and I'm a software engineer from the Workflow Platform
> > > team at Pinterest. On behalf of the Workflow Platform team and the
> > > Analytics Platform team at Pinterest, I would like to raise an AIP to
> > > the open source community to discuss the proposal to build a "UI DAG
> > > Composer" into Airflow.
> > >
> > > At Pinterest, we built our internal workflow system on top of Airflow.
> > > And over the past few years, we have done intensive development on the
> > > system and built customizations to meet the workflow orchestration
> > > requirements of our internal customers.
> > >
> > > Currently, the two teams are working on building a feature that brings
> > > the "UI-based" workflow composing experience into our "internal Airflow"
> > > system. The motivation of this project is:
> > >
> > > - Enable our internal customers who might not be familiar with coding
> > >   to be able to use Airflow.
> > > - Consolidate our internal systems to simplify the product offerings.
> > >
> > > Given that, the goal of the project is to build an “UI Composer” into
> > > the Airflow system and expose it from the Airflow UI. Users can then
> > > utilize it to build their workflow without the need to know the
> > > underlying Airflow DAG domain specific language (DSL).
> > >
> > > As we are working on this project internally, we would also like to
> > > propose this feature to the Airflow community and discuss the potential
> > > of contributing it back to open source, since we believe this feature
> > > can be beneficial to other users/organizations as well.
> > >
> > > You can find the details of the proposal in this Google doc: [AIP
> > > Proposal] Airflow UI DAG Composer
> > > <https://docs.google.com/document/d/1KGrLj1vSmtsNXRyj909xO8-niB1s5db35bPuM7FQuXc/edit#heading=h.mnaz328tvctz>
> > > (please request access if you hit any access issue with the doc). We
> > > would love to hear your inputs/feedback, thanks in advance.
> > >
> > > Best,
> > > Yulei
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > For additional commands, e-mail: dev-h...@airflow.apache.org