pankajkoti commented on code in PR #36022: URL: https://github.com/apache/airflow/pull/36022#discussion_r1412798224
########## docs/apache-airflow/security/security_model.rst ##########

@@ -97,10 +97,74 @@ capabilities authenticated users may have:

 For more information on the capabilities of authenticated UI users, see :doc:`/security/access-control`.

+Capabilities of DAG Authors
+---------------------------
+
+DAG authors are able to submit code - via Python files placed in the DAG_FOLDER - that will be executed

Review Comment:
```suggestion
DAG authors are able to submit code - via Python files placed in the DAGS_FOLDER - that will be executed
```

########## docs/apache-airflow/security/security_model.rst ##########

@@ -97,10 +97,74 @@ capabilities authenticated users may have:

+Capabilities of DAG Authors
+---------------------------
+
+DAG authors are able to submit code - via Python files placed in the DAG_FOLDER - that will be executed
+in a number of circumstances. The code to execute is not verified nor checked nor sand-boxed by Airflow
+(that would be very difficult if not impossible to do), so effectively DAG authors can execute arbitrary
+code on the workers (part of Celery Workers for Celery Executor, local processes run by scheduler in case
+of Local Executor, Task Kubernetes POD in case of Kubernetes Executor), in the DAG File Processor
+(which can be either executed as standalone process or can be part of the Scheduler) and in the Triggerer.
+
+There are several consequences of this model chosen by Airflow, that deployment managers need to be aware of:
+
+* In case of Local Executor and DAG File Processor running as part of the Scheduler, DAG authors can execute
+  arbitrary code on the machine where scheduler is running. This means that they can affect the scheduler
+  process itself, and potentially affect the whole Airflow installation - including modifying cluster-wide
+  policies and changing Airflow configuration . If you are running Airflow with one of those settings,
+  the Deployment Manager must trust the DAG authors not to abuse this capability.
+
+* In case of Celery Executor, DAG authors can execute arbitrary code on the Celery Workers. This means that
+  they can potentially influence all the task executed on the same worker. If you are running Airflow with
+  Celery Executor, the Deployment Manager must trust the DAG authors not to abuse this capability and unless
+  Deployment Manager separates task execution by queues by Cluster Policies, they should assume, there is no
+  isolation between tasks.
+
+* In case of Kubernetes Executor, DAG authors can execute arbitrary code on the Kubernetes POD they run. Each
+  task is executed in a separate POD, so there is already isolation between tasks as generally speaking
+  Kubernetes provides isolation between PODs.
+
+* In case of Triggerer, DAG authors can execute arbitrary code in Triggerer. Currently there are no
+  enforcement mechanisms that would allow to isolate tasks that are using deferrable functionality from
+  each other and arbitrary code from various tasks can be executed in the same process/machine. Deployment
+  Manager must trust that DAG authors will not abuse this capability.
+
+* The Deployment Manager might isolate the code execution provided by DAG authors - particularly in

Review Comment:
This point is really helpful 💯. I am wondering whether it would make sense to move this list item further down. The list items above explain capabilities of DAG authors, and there is one more list item below this one explaining further capabilities; this item, however, reads more like a helpful suggestion for the Deployment Manager, and there is one more such suggestion at the end, about adding tooling to review code. So perhaps we could move these two suggestions to the end of the list, after the capabilities of DAG authors have been explained.
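To make the "arbitrary code" point in the quoted hunk concrete: anything at the top level of a Python file in the DAGs folder runs as soon as the file is parsed, before any task is scheduled. The following is a deliberately simplified, hypothetical stand-in for the DAG File Processor (Airflow's real parser is far more involved), illustrating why DAG author code is effectively arbitrary code in the parsing process:

```python
import pathlib
import tempfile

# Toy stand-in for Airflow's DAG File Processor: it executes every Python
# file found in the DAGs folder. Top-level statements run unconditionally,
# which is why a DAG author's code is effectively arbitrary code in the
# process that parses DAG files.
def parse_dag_folder(dags_folder: pathlib.Path) -> dict:
    namespace = {}
    for py_file in sorted(dags_folder.glob("*.py")):
        source = py_file.read_text()
        # This is the crux: the author's file is simply executed.
        exec(compile(source, str(py_file), "exec"), namespace)
    return namespace

with tempfile.TemporaryDirectory() as tmp:
    dags = pathlib.Path(tmp)
    # The "DAG author" ships a file whose top level does more than declare
    # a DAG - any side effect could go here (network calls, file writes...).
    (dags / "example_dag.py").write_text(
        "side_effect = 'ran at parse time'\n"
    )
    ns = parse_dag_folder(dags)
    print(ns["side_effect"])  # → ran at parse time
```

This is why the documentation being reviewed stresses trust in DAG authors: the parse step alone already executes author-controlled code, wherever the DAG File Processor runs (standalone or inside the Scheduler).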
########## docs/apache-airflow/security/security_model.rst ##########

@@ -97,10 +97,74 @@ capabilities authenticated users may have:

 For more information on the capabilities of authenticated UI users, see :doc:`/security/access-control`.
+Capabilities of DAG Authors
+---------------------------
+
+DAG authors are able to submit code - via Python files placed in the DAG_FOLDER - that will be executed
+in a number of circumstances. The code to execute is not verified nor checked nor sand-boxed by Airflow

Review Comment:
```suggestion
in a number of circumstances. The code to execute is neither verified, checked nor sand-boxed by Airflow
```
or it could simply be "Airflow does not verify, check or sandbox the code to be executed".

########## docs/apache-airflow/security/security_model.rst ##########

@@ -97,10 +97,74 @@ capabilities authenticated users may have:

+* In case of Celery Executor, DAG authors can execute arbitrary code on the Celery Workers. This means that
+  they can potentially influence all the task executed on the same worker. If you are running Airflow with

Review Comment:
```suggestion
  they can potentially influence all the tasks executed on the same worker. If you are running Airflow with
```

########## docs/apache-airflow/security/security_model.rst ##########

@@ -97,10 +97,74 @@ capabilities authenticated users may have:

+  process itself, and potentially affect the whole Airflow installation - including modifying cluster-wide
+  policies and changing Airflow configuration . If you are running Airflow with one of those settings,

Review Comment:
```suggestion
  policies and changing Airflow configuration. If you are running Airflow with one of those settings,
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
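The Celery Executor bullet in the hunk mentions that a Deployment Manager can separate task execution by queues via Cluster Policies. A `task_policy` hook in `airflow_local_settings.py` is Airflow's cluster-policy mechanism for mutating tasks at parse time; the sketch below shows one way it might be used to pin tasks to dedicated queues. The mapping, DAG ids and queue names are invented for illustration, and a `SimpleNamespace` stands in for a real `BaseOperator` so the sketch runs without Airflow installed:

```python
from types import SimpleNamespace

# Hypothetical mapping of DAG ids to dedicated Celery queues; the names
# here are invented for illustration only.
TEAM_QUEUES = {"finance_etl": "finance_workers"}

def task_policy(task):
    """Cluster-policy hook: Airflow calls this for every task at parse time.

    Placed in airflow_local_settings.py, it may mutate the task - here it
    pins tasks of selected DAGs to a dedicated queue, so their execution is
    separated from other teams' Celery workers.
    """
    queue = TEAM_QUEUES.get(task.dag_id)
    if queue is not None:
        task.queue = queue

# Stand-in for a BaseOperator so the sketch runs without Airflow installed;
# a real task would be e.g. a PythonOperator inside a DAG.
demo_task = SimpleNamespace(dag_id="finance_etl", queue="default")
task_policy(demo_task)
print(demo_task.queue)  # → finance_workers
```

Note this only routes tasks to workers listening on specific queues; as the quoted text says, without such separation the Deployment Manager should assume there is no isolation between tasks on a shared Celery worker.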