eladkal commented on code in PR #50371: URL: https://github.com/apache/airflow/pull/50371#discussion_r2102327257
##########
airflow-core/src/airflow/dag_processing/processor.py:
##########

@@ -94,8 +96,35 @@ def _parse_file_entrypoint():
     comms_decoder.send_request(log, result)


+def _pre_import_airflow_modules(file_path: str, log: FilteringBoundLogger) -> None:
+    """
+    Pre-import Airflow modules found in the given file.
+
+    This prevents modules from being re-imported in each processing process,
+    saving CPU time and memory.
+
+    Pre-importing is controlled by the ``[scheduler] parsing_pre_import_modules``
+    config option (default: True).
+
+    Args:
+        file_path: Path to the file to scan for imports
+        log: Logger instance to use for warnings
+    """
+    if not conf.getboolean("scheduler", "parsing_pre_import_modules", fallback=True):
+        return
+
+    for module in iter_airflow_imports(file_path):
+        try:
+            importlib.import_module(module)
+        except ModuleNotFoundError as e:
+            log.warning("Error when trying to pre-import module '%s' found in %s: %s", module, file_path, e)

Review Comment:
   Correct me if I am wrong, but if an import fails here, the dag processor will never try to re-import that module until the process is restarted. I think we need to change this. It would also explain why, in some cases (in 2.x), after a dag processor restart I consistently see many DAGs reported as broken with import errors on my own modules.
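For context on the mechanism under review: the pre-import step scans a DAG file for top-level `airflow.*` imports and imports them once, so that forked parsing processes inherit the already-loaded modules. The sketch below is a rough approximation under stated assumptions; `iter_airflow_imports_sketch` is a hypothetical stand-in and may differ from the real `iter_airflow_imports` helper in Airflow.

```python
import ast
import importlib
import logging

log = logging.getLogger(__name__)


def iter_airflow_imports_sketch(file_path: str):
    """Yield names of 'airflow.*' modules imported at any level of the file.

    Hypothetical approximation of Airflow's import scanner: it parses the
    file's AST without executing it, then collects module names from both
    plain ``import`` and absolute ``from ... import`` statements.
    """
    with open(file_path) as f:
        tree = ast.parse(f.read(), filename=file_path)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.startswith("airflow"):
                    yield alias.name
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports (node.level > 0); they are not importable
            # by absolute name.
            if node.module and node.module.startswith("airflow") and node.level == 0:
                yield node.module


def pre_import_sketch(file_path: str) -> None:
    """Import each scanned module once; log (rather than raise) on failure."""
    for module in iter_airflow_imports_sketch(file_path):
        try:
            importlib.import_module(module)
        except ModuleNotFoundError as e:
            log.warning("Error pre-importing module '%s' found in %s: %s", module, file_path, e)
```

Note that a successful `importlib.import_module` call populates `sys.modules`, which is what forked children inherit; the reviewer's point is that a failure at this stage is only logged, so nothing in this code path retries the import later.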