I have a bunch of dags up and running, they're all working fine. Each
exists in its own directory and they're often split up into multiple files
for cleanliness/organization - all files for each dag reside in their
subdirectory and they don't share anything. The way they reference their
files is like so:
    import sys, os
    sys.path.insert(0, os.path.abspath(os.path.dirname(__file__)))
    import my_other_file_in_this_dir

It works nicely...even if multiple dags have files with the same names (like,
they each have a "config.py"), airflow keeps them all isolated, and they
behave just how you would want and expect them to...Awesome! Now I've
decided I'm going to add some simple dag validation tests as part of my
CI/CD, so I wrote up a test much like this:
    from airflow.models import DagBag

    def test_no_import_errors():
        dag_bag = DagBag(dag_folder=path_to_my_dags, include_examples=False)
        assert len(dag_bag.import_errors) == 0, \
            f"Import failures: {dag_bag.import_errors}"
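For what it's worth, I can reproduce the same clash outside airflow entirely, in plain Python, so it looks like ordinary sys.modules caching to me (hypothetical dag names, nothing airflow-specific):

```python
import os
import sys
import tempfile

# Two hypothetical dag dirs, each with its own config.py.
root = tempfile.mkdtemp()
for dag, value in [("dag_a", "1"), ("dag_b", "2")]:
    d = os.path.join(root, dag)
    os.mkdir(d)
    with open(os.path.join(d, "config.py"), "w") as f:
        f.write(f"value = {value}\n")

# Simulate what each dag file does at parse time, all in one process.
sys.path.insert(0, os.path.join(root, "dag_a"))
import config
first = config.value   # 1 -- dag_a's config, now cached in sys.modules

sys.path.insert(0, os.path.join(root, "dag_b"))
import config          # cache hit: sys.modules["config"] already exists
second = config.value  # still 1 -- dag_b's config.py is never read
```

The second import never even looks at dag_b's directory, because Python finds "config" already sitting in sys.modules.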

...and what the heck, I'm getting import errors like "AttributeError: module
'config' has no attribute 'value2'" because all the config.py files are
conflicting - instead of each dag getting the config.py in its own directory,
they're all getting whichever config.py happened to get loaded first. So I
take a look through the code, and I can't figure out why it actually works
in airflow - reading the code I would expect the modules to conflict in
airflow just like they do in the test. Now I'm worrying that all my dags
are doing something that is completely unsupported and they're only working
in airflow by some weird fluke. Can anyone offer any insight into how they
actually work in airflow and/or how I could get my tests to work in the
same way? Is this actually a supported/expected/normal thing to do?  I've
uploaded a simple repro of the issue here
https://github.com/repl-chris/airflow-dag-isolation for clarity and/or
playing.  Thanks!
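Incidentally, the one way I've found to get the isolation back in a stand-alone script is to parse each directory in its own subprocess, so each one starts with a fresh sys.modules. That's purely my guess at what airflow's per-file processing is effectively doing, not a claim about its actual mechanism (same hypothetical layout as above):

```python
import os
import subprocess
import sys
import tempfile

# Same hypothetical layout: two dirs, each with a clashing config.py.
root = tempfile.mkdtemp()
for dag, value in [("dag_a", "1"), ("dag_b", "2")]:
    d = os.path.join(root, dag)
    os.mkdir(d)
    with open(os.path.join(d, "config.py"), "w") as f:
        f.write(f"value = {value}\n")

values = []
for dag in ("dag_a", "dag_b"):
    d = os.path.join(root, dag)
    # A child process starts with a fresh sys.modules, so each dir's
    # config.py is imported independently instead of hitting the cache.
    out = subprocess.run(
        [sys.executable, "-c",
         "import sys; sys.path.insert(0, sys.argv[1]); "
         "import config; print(config.value)", d],
        capture_output=True, text=True, check=True,
    )
    values.append(out.stdout.strip())

print(values)  # ['1', '2'] -- each dir sees its own config.py
```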

- Chris
