potiuk commented on code in PR #36022: URL: https://github.com/apache/airflow/pull/36022#discussion_r1412794928
########## docs/apache-airflow/security/security_model.rst:
##########
@@ -97,10 +97,74 @@ capabilities authenticated users may have:

 For more information on the capabilities of authenticated UI users, see :doc:`/security/access-control`.

+Capabilities of DAG Authors
+---------------------------
+
+DAG authors are able to submit code - via Python files placed in the DAG_FOLDER - that will be executed
+in a number of circumstances. The code to execute is neither verified, checked, nor sandboxed by Airflow
+(that would be very difficult, if not impossible, to do), so effectively DAG authors can execute arbitrary
+code on the workers (as part of Celery Workers for the Celery Executor, local processes run by the scheduler
+in the case of the Local Executor, the task's Kubernetes POD in the case of the Kubernetes Executor), in the
+DAG File Processor (which can be executed either as a standalone process or as part of the Scheduler), and
+in the Triggerer.
+
+There are several consequences of this model chosen by Airflow that Deployment Managers need to be aware of:
+
+* In the case of the Local Executor, or the DAG File Processor running as part of the Scheduler, DAG authors
+  can execute arbitrary code on the machine where the scheduler is running. This means that they can affect
+  the scheduler process itself, and potentially the whole Airflow installation - including modifying
+  cluster-wide policies and changing the Airflow configuration. If you are running Airflow with one of those
+  settings, the Deployment Manager must trust the DAG authors not to abuse this capability.
+
+* In the case of the Celery Executor, DAG authors can execute arbitrary code on the Celery Workers. This
+  means that they can potentially influence all the tasks executed on the same worker.
If you are running Airflow with
+  the Celery Executor, the Deployment Manager must trust the DAG authors not to abuse this capability, and
+  unless the Deployment Manager separates task execution by queues via Cluster Policies, they should assume
+  that there is no isolation between tasks.
+
+* In the case of the Kubernetes Executor, DAG authors can execute arbitrary code in the Kubernetes POD where
+  their task runs. Each task is executed in a separate POD, so there is already isolation between tasks, as
+  generally speaking Kubernetes provides isolation between PODs.
+
+* In the case of the Triggerer, DAG authors can execute arbitrary code in the Triggerer. Currently there are
+  no enforcement mechanisms that would allow isolating tasks that use the deferrable functionality from each
+  other, and arbitrary code from various tasks can be executed in the same process/machine. The Deployment
+  Manager must trust that DAG authors will not abuse this capability.
+
+* The Deployment Manager might isolate the code execution provided by DAG authors - particularly in the
+  Scheduler and Webserver - by making sure that the Scheduler and Webserver do not even have access to the
+  DAG Files (this requires the standalone DAG File Processor to be deployed). Generally speaking, no
+  DAG-author-provided code should ever be executedted in the Scheduler or Webserver process.
+
+* There are a number of functionalities that allow the DAG author to point out code to be executed in the
+  scheduler or webserver process - for example they can choose custom Timetables, UI plugins, Connection UI

Review Comment:
   Perfect :) . I was just asking in the original PR https://github.com/apache/airflow/pull/35210

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
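As an aside, the queue-based separation via Cluster Policies that the diff mentions could look roughly like the sketch below. ``task_policy`` is the hook name Airflow looks up in ``airflow_local_settings.py``; the team prefixes and queue names here are hypothetical, and a real deployment would pair this with Celery workers started with matching ``--queues`` flags:

```python
# airflow_local_settings.py -- a sketch of a cluster policy that routes tasks
# to per-team Celery queues. The "team_a_"/"team_b_" DAG id prefixes and the
# queue names are made-up examples, not anything Airflow defines.
#
# Airflow discovers a module-level function named ``task_policy`` in this
# module and calls it for every task when DAG files are parsed.

def task_policy(task) -> None:
    """Pin each task to a queue based on its DAG id prefix, so that Celery
    workers subscribed to only one queue never run another team's code."""
    if task.dag_id.startswith("team_a_"):
        task.queue = "team_a"
    elif task.dag_id.startswith("team_b_"):
        task.queue = "team_b"
    else:
        # Everything unrecognised lands on a shared default queue, where
        # no isolation between DAG authors should be assumed.
        task.queue = "default"
```

Note this only *partitions* task execution across workers; tasks sharing a queue (and therefore a worker) still have no isolation from each other, as the diff points out.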