I added you, Yulei,

I have no time to look at the details, but I have two big concerns about
this - regarding Audience and Security (first concern) and whether we want
to do it at all (second concern).

First about security and audience:

This goes against the current Security Model of Airflow:
https://airflow.apache.org/docs/apache-airflow/stable/security/security_model.html
where the users of the Airflow UI are a completely distinct group of
people from DAG authors. All our efforts (including the latest AIP-56 -
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-56+Extensible+user+management)
assume that this is the case: DAG authors are not necessarily the same
people (and are often a different user group) as those who use the Airflow
UI to manage DAG execution and maintenance.

I personally think putting DAG authoring into the Airflow "webserver" is a
no-go and I would vote against it. IMHO the Airflow UI is for maintenance
and monitoring of existing DAGs, NOT for DAG authoring. IMHO there is no
particular reason why these two should be tightly integrated as a single
"Airflow webserver" component. I do not even know if we really want to
maintain a single "UI driven" DAG authoring experience in the Airflow
community. I think Airflow provides a very good "base" for anyone to
develop and release their own visual DAG authoring solution - and many
stakeholders and users of Airflow do so already. You can develop any
solution that produces Python DAG files and stores them in the Airflow DAG
folder - there is nothing to prevent anyone from doing it. Maybe it would
be worthwhile to expose some APIs from Airflow to make it easier and more
"native" (for example, an API to let the Airflow Scheduler/DAG file
processor know to prioritise, or force, parsing of files placed in the DAG
folder), but I do not see a particular reason (and actually I see a lot of
security-driven reasons not) to have the Python files coming through the
Airflow Webserver (either generated by it or submitted to it).
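
As a minimal sketch of what such an external solution could look like: a
standalone tool (not the webserver) turns a declarative spec into a Python
DAG file and drops it into the DAG folder. Everything named here (SPEC,
render_dag_source, write_dag_file, the spec layout) is hypothetical and only
for illustration; the generated file assumes Airflow 2.4+ (for the
`schedule` argument) and a DAG folder that the external tool can write to.

```python
import keyword
import os
from pathlib import Path

# Hypothetical spec, e.g. what an external visual editor might emit.
SPEC = {
    "dag_id": "example_generated_dag",
    "schedule": "@daily",
    "tasks": [
        {"task_id": "extract", "bash_command": "echo extract"},
        {"task_id": "load", "bash_command": "echo load", "upstream": ["extract"]},
    ],
}


def render_dag_source(spec: dict) -> str:
    """Render the spec as the source code of an ordinary Airflow DAG file."""
    task_ids = [task["task_id"] for task in spec["tasks"]]
    for task_id in task_ids:
        # task_ids are reused as Python variable names in the generated file.
        if not task_id.isidentifier() or keyword.iskeyword(task_id):
            raise ValueError(f"task_id is not a valid Python name: {task_id!r}")

    lines = [
        "from datetime import datetime",
        "",
        "from airflow import DAG",
        "from airflow.operators.bash import BashOperator",
        "",
        "with DAG(",
        f"    dag_id={spec['dag_id']!r},",
        f"    schedule={spec['schedule']!r},",
        "    start_date=datetime(2023, 1, 1),",
        "    catchup=False,",
        ") as dag:",
    ]
    for task in spec["tasks"]:
        lines.append(
            f"    {task['task_id']} = BashOperator("
            f"task_id={task['task_id']!r}, "
            f"bash_command={task['bash_command']!r})"
        )
    for task in spec["tasks"]:
        for upstream in task.get("upstream", []):
            if upstream not in task_ids:
                raise ValueError(f"unknown upstream task: {upstream!r}")
            lines.append(f"    {upstream} >> {task['task_id']}")
    return "\n".join(lines) + "\n"


def write_dag_file(spec: dict, dags_folder: Path) -> Path:
    """Write the generated DAG into dags_folder and return its path."""
    source = render_dag_source(spec)
    # Syntax check only; this does not import or execute Airflow.
    compile(source, f"{spec['dag_id']}.py", "exec")

    target = dags_folder / f"{spec['dag_id']}.py"
    tmp = target.with_name(target.name + ".tmp")
    tmp.write_text(source)
    # Rename into place so a half-written .py file is never visible.
    os.replace(tmp, target)
    return target
```

Calling write_dag_file(SPEC, Path("/path/to/dags")) (the path is a
placeholder) would leave a plain Python DAG file for the scheduler and DAG
file processor to pick up on their own; the webserver never touches it.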

In Airflow 2 we deliberately isolated the webserver from the DAGs folder
for that particular reason - security - and we've done it not only for
"write" access: we also removed the need for "read" access. This is how
Airflow 2 works today: the DAGs folder SHOULD NOT be available to the
webserver component AT ALL. Not even as an option. It just SHOULD NOT
happen. The Airflow Webserver should run in an isolated environment where
the only thing it has access to is the Airflow DB. Full stop.

This proposal goes in the completely opposite direction (if I read it
right) - it expects the Airflow webserver to have access to the DAGs folder
(or Git, or whatever the "source" of the DAG folder is). For me, from the
security point of view, this is a plain NO. It violates a number of
assumptions and principles we adopted when we defined, published and agreed
on our Security Model, and it opens a floodgate of potential security
issues that we simply do not have to deal with currently. As such, I'd say
it's never going to happen (this way).

Of course - it could be a separate component - not connected to the
Airflow webserver, exposing its own completely different interface. It
could be somewhat connected to the Airflow UI - for example, I can easily
imagine that DAGs displayed in the Airflow UI have links (provided via a
plugin or API) that take the user of the Airflow UI to another server -
deep-linking to the exact page where you can edit and submit DAGs. As long
as it is completely outside of the Airflow UI, with different
authentication, security controls etc. - this is a perfectly acceptable
solution for me. However, this is a different component (I'd call it the
"DAG authoring" one) and I am not at all sure that we want to implement and
maintain such a component in the Airflow community. This is a huge new
component to maintain, which requires a lot of UI expertise (it is much
more interactive than the current Airflow UI), and I think that agreeing to
maintain it would be a huge obligation for the community - one I think we
are currently not capable of taking on.


Second concern - do we want to have such a separate component at all?

Assuming that my security concern is addressed (this will be a very hard
NO from me if it is not), the second concern is about whether we want such
a separate DAG authoring component to be maintained and released by the
Airflow community.

I believe that quite possibly we do not even WANT to do it. I kind of like
the situation where Airflow exposes APIs that 3rd parties (like Pinterest)
can plug into, and then release and manage their own solutions for that.
And there might be more than that - there are already at least a few
solutions out there that allow that - some of them are open-source and
available to everyone, some of them are available as a service
accompanying managed installations of Airflow. And I see no particular
reason why every single Airflow user out there should stick to a single
"Airflow community blessed" solution. Such solutions are necessarily
opinionated - you cannot have a fully generic UI-driven solution that
matches the flexibility of Python DAG authoring. This is simply impossible
by definition - this is actually why Airflow's DAG authoring is Python and
not YAML or any other declarative format - because of the flexibility and
un-opinionated way in which you can build your DAG authoring setup. And I
like how different solutions offer different users different "opinions" on
how they can author DAGs visually. I do not feel any particular need for
the community to choose and bless a single solution as the "chosen one",
and even less so to take on the burden of developing and maintaining it.

So - before you spend your time on submitting the proposal - I would
strongly advise discussing here whether this is something that we - as a
community - want AT ALL.

If you ask me - the discussion should maybe be "how can we make Airflow
better suited for anyone who would like to build such a solution?".
Maybe we should figure out the set of APIs that are needed to make it
easier? Maybe we should document how this can be done in our documentation?
Maybe it calls for a more prominent "UI DAG authoring solutions" section in
the "Airflow Ecosystem" page https://airflow.apache.org/ecosystem/ where we
could explicitly list solutions that allow for visual DAG authoring - so
that companies like Pinterest, which publish, release and maintain such a
solution, can add links there and make it more easily discoverable by
users who are looking for such solutions.

This is not a hard NO. If there are people in our community who show
commitment and are willing and happy to manage and release such a
component, and if we - as a community - want to manage it and see value in
such a component, it will not be a hard NO from me. It will be "I think we
do not need it, we have no capacity for it, and it will slow us down a lot
if we do it". But I will not veto it if there is an overwhelming majority
of others who would vote for it and express interest in maintaining it.

So my proposal is to turn that discussion into "do we really want that,
and what are the alternatives?".

Of course - it's only my personal opinion, and I guess others might have a
different one here. But I am curious what others think on both counts:

1) Security aspect

2) Do we want visual DAG authoring in the community at all?


J.


On Wed, Nov 15, 2023 at 8:46 PM Yulei Li <yule...@pinterest.com.invalid>
wrote:

> Hi Ash,
>
> Missed your email, unfortunately my organization disabled the Google Doc
> option to share the Doc Link that is readable to everyone, so I tried to
> grant the read access to the "dev@airflow.apache.org" email alias but it
> did not work.
>
> I have granted commenter permissions to all the requests that I have
> received. But if it makes it easier, I can convert the Doc into an AIP
> <
> https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals
> >,
> just need to request edit access to it. And my Wiki ID is: *yuleili*.
> Thanks
>
> Best,
> Yulei
>
> On Fri, Nov 10, 2023 at 4:16 AM Ash Berlin-Taylor <a...@apache.org> wrote:
>
> > Please make the document readable by anyone with the link.
> >
> > Thanks,
> > Ash
> >
> > > On 9 Nov 2023, at 21:10, Yulei Li <yule...@pinterest.com.INVALID>
> wrote:
> > >
> > > Hi Airflow community,
> > >
> > > My name is Yulei and I'm a software engineer from the Workflow Platform
> > > team at Pinterest. On behalf of the Workflow Platform team and the
> > > Analytics Platform team at Pinterest, I would like to raise an AIP to
> the
> > > open source community to discuss the proposal to build a "UI DAG
> > Composer"
> > > into Airflow.
> > >
> > > At Pinterest, we built our internal workflow system on top of Airflow.
> > And
> > > over the past few years, we have done intensive development on the
> system
> > > and built customizations to meet the workflow orchestration
> requirements
> > of
> > > our internal customers.
> > >
> > > Currently, the two teams are working on building a feature that brings
> > the
> > > "UI-based" workflow composing experience into our "internal Airflow"
> > > system. The motivation of this project is:
> > >
> > >   - Enable our internal customers who might not be familiar with coding
> > to
> > >   be able to use Airflow.
> > >   - Consolidate our internal systems to simplify the product offerings.
> > >
> > > Given that, the goal of the project is to build an “UI Composer” into
> the
> > > Airflow system and expose it from the Airflow UI. Users can then
> utilize
> > it
> > > to build their workflow without the need to know the underlying Airflow
> > DAG
> > > domain specific language (DSL).
> > >
> > > As we are working on this project internally, we would also like to
> > propose
> > > this feature to the Airflow community and discuss the potential of
> > > contributing it back to open source, since we believe this feature can
> be
> > > beneficial to other users/organizations as well.
> > >
> > > You can find the details of the proposal in this Google doc: [AIP
> > Proposal]
> > > Airflow UI DAG Composer
> > > <
> >
> https://docs.google.com/document/d/1KGrLj1vSmtsNXRyj909xO8-niB1s5db35bPuM7FQuXc/edit#heading=h.mnaz328tvctz
> > >
> > > (please
> > > request access if you hit any access issue with the doc). We would love
> > to
> > > hear your inputs/feedback, thanks in advance.
> > >
> > > Best,
> > > Yulei
