Cedric Hourcade created AIRFLOW-1729:
----------------------------------------

             Summary: Ignore whole directories in .airflowignore
                 Key: AIRFLOW-1729
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1729
             Project: Apache Airflow
          Issue Type: Improvement
          Components: core
    Affects Versions: Airflow 2.0
            Reporter: Cedric Hourcade
            Priority: Minor


The .airflowignore file lets you exclude files from the DAG scan. But even if 
we blacklist a full directory, {{os.walk}} still descends into it, no matter 
how deep it is, and skips its files one by one. This can be an issue when you 
keep big .git or virtualenv directories around.

I suggest adding something like:
{code}
dirs[:] = [d for d in dirs
           if not any(re.search(p, os.path.join(root, d)) for p in patterns)]
{code}
to prune the directories here: 
https://github.com/apache/incubator-airflow/blob/cfc2f73c445074e1e09d6ef6a056cd2b33a945da/airflow/utils/dag_processing.py#L208-L209
 and in {{list_py_file_paths}}
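
To illustrate the idea outside of Airflow, here is a minimal, self-contained sketch of directory pruning with {{os.walk}}. The function name {{walk_pruned}} and its parameters are hypothetical and are not taken from the Airflow code base. It assumes {{patterns}} is a list of regular expression strings, as read from .airflowignore.

```python
import os
import re


def walk_pruned(top, patterns):
    """Yield file paths under `top`, never descending into ignored directories.

    Hypothetical helper for illustration only; not part of Airflow.
    """
    compiled = [re.compile(p) for p in patterns]
    # topdown=True (the default) is required for pruning to take effect.
    for root, dirs, files in os.walk(top, topdown=True):
        # Assigning to dirs[:] mutates the list that os.walk holds, so the
        # removed directories are never visited. Rebinding `dirs = [...]`
        # would have no effect on the traversal.
        dirs[:] = [
            d for d in dirs
            if not any(p.search(os.path.join(root, d)) for p in compiled)
        ]
        for name in files:
            path = os.path.join(root, name)
            # Individual files are still checked against the same patterns.
            if not any(p.search(path) for p in compiled):
                yield path
```

With a pattern such as {{\.git}}, the whole .git subtree is dropped at the first level instead of being walked and filtered file by file.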




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
