potiuk commented on PR #35210:
URL: https://github.com/apache/airflow/pull/35210#issuecomment-1830836013

   >UPDATE: Okay I was wrong. Just tested with an example DAG and if you use 
the module name <dagfile>.<class> then it is possible to inject code from the 
DAG directly.
   
   Yep, it is.
   
   > If the dags folder is added to PYTHONPATH, then yes, but I need to check 
if there is a protection for dag file processor, which processes these files 
and sometimes it's running in the scheduler service.
   
   Nope. There is no protection. Whatever is on PYTHONPATH can be used - and Airflow will AUTOMATICALLY add the `dags` folder to PYTHONPATH too.
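
   To make it concrete, here is a minimal sketch (plain Python, not Airflow code; the file and class names are made up) of why a dotted path like `<dagfile>.<class>` resolves once the `dags` folder is on `sys.path` - and why resolving it executes the DAG author's file in whatever process does the import:

   ```python
   import importlib
   import sys
   import tempfile
   from pathlib import Path

   # Stand-in for the dags folder; in a real deployment Airflow itself puts the
   # configured dags folder on sys.path.
   dags_folder = Path(tempfile.mkdtemp())
   (dags_folder / "my_dag.py").write_text(
       "print('this runs at import time, in the importing process')\n"
       "class MyTimetable:\n"
       "    pass\n"
   )
   sys.path.insert(0, str(dags_folder))

   # A dotted path such as "my_dag.MyTimetable" now resolves like any other
   # module, and importing it runs the whole DAG file top to bottom.
   module_name, class_name = "my_dag.MyTimetable".rsplit(".", 1)
   cls = getattr(importlib.import_module(module_name), class_name)
   print(cls)
   ```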
   
   Some more context on that one.
   
   Historically, when the dag file processor was not standalone, this was even more important.
   
   This is also due to historical reasons. The DAGFileProcessor in likely 9X% of Airflow installations is not "standalone". It is a newly forked process, so it is not really the "same" context as the scheduler - they are different processes. But they share everything else (memory, filesystem) and they use the same PYTHONPATH - there is only one process the original PYTHONPATH can be set on (`airflow scheduler`). So that basically means that both the `airflow scheduler` process and the `dag file processor` subprocess have access to the same PYTHONPATH, and the DAG folder is on that PYTHONPATH by definition. This is the reason why we cannot let the scheduler do `import("arbitrary import provided by DAG author")`.
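
   A rough illustration of that sharing (a minimal sketch assuming a Unix `fork`, which is what the non-standalone processor effectively relies on; not Airflow code):

   ```python
   import os
   import sys

   # Stand-in for Airflow adding the dags folder to the scheduler's sys.path.
   sys.path.insert(0, "/path/to/dags")

   pid = os.fork()  # Unix only: the child inherits memory, open files and sys.path
   if pid == 0:
       # Child - a stand-in for the forked DAG file processor. It sees exactly
       # the same sys.path as the parent, so anything importable from the dags
       # folder in one process is importable in the other.
       print("processor sys.path[0]:", sys.path[0])
       os._exit(0)
   else:
       os.waitpid(pid, 0)
       print("scheduler sys.path[0]:", sys.path[0])
   ```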
   
   It's only with the `standalone dag processor` (which we plan to announce more prominently when it is ready) that we can actually isolate the scheduler from the DAG folder and user code. Once we have it, it is theoretically possible to run `airflow scheduler` without the scheduler even SEEING the DAG folder. Simply speaking, when the standalone dag file processor is configured, `airflow scheduler` is essentially DAG-less. This is a far more secure setup (but almost no one uses it yet), and it will be a base for multi-tenancy separation. And then the `plugin` option is a bit less important (but still quite important, because a DAG author could make Airflow import ANY Python package - which we do not want, because it could later allow crossing boundaries between tenants).

