potiuk commented on PR #35097:
URL: https://github.com/apache/airflow/pull/35097#issuecomment-1773764486

   One thing worth mentioning is that even 0.3 s is pretty impactful. All our 
DAGs are parsed by the DAG file processor in separately forked processes, so 
if such a `pandas` import happens at the top level of all DAG files, by 
default the 0.3 s is an overhead for every DAG every 30 s (or whatever the 
minimum parsing interval is set to). 
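
   To put a rough number on it, here is a back-of-the-envelope sketch. The 
function name and the 200-file deployment are made up for illustration, and it 
assumes every DAG file is re-parsed once per interval and pays the full import 
cost each time, which is a simplification.

```python
def parsing_overhead_seconds(
    import_cost_s: float, dag_files: int, interval_s: float, window_s: float
) -> float:
    """Total time spent on one top-level import across all DAG files in a window.

    Assumes (simplification) that each file is parsed once per interval and
    pays the full import cost on every parse.
    """
    parses_per_file = window_s / interval_s
    return import_cost_s * dag_files * parses_per_file


# Hypothetical deployment: 200 DAG files, each importing pandas at top level,
# a 0.3 s import cost, and a 30 s parsing interval, measured over one hour.
per_hour = parsing_overhead_seconds(0.3, dag_files=200, interval_s=30, window_s=3600)
print(f"{per_hour:.0f} s of import overhead per hour")
```

   Under those assumptions the import alone costs about 7200 s of parsing time 
per hour, spread across the parsing processes.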
   
   We already deal with some of that via 
https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#parsing-pre-import-modules
 (but only for airflow imports, not for other expensive imports). Also, I 
think it's worth explicitly keeping the import advice in for another case. I 
have seen quite a few examples where "organisation-level" imports were doing 
a lot of things that were hidden from the DAG developer (for example, I saw 
configuration being pulled via an expensive API call there). 
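
   A minimal sketch of what such a hidden cost can look like, and one way to 
defer it. Everything here is hypothetical: `fetch_org_config` stands in for 
the expensive API call (simulated with a short sleep), and `get_org_config` 
is an illustrative name, not anything from Airflow.

```python
import functools
import time


def fetch_org_config() -> dict:
    """Hypothetical stand-in for an expensive API call."""
    time.sleep(0.05)  # simulate network latency
    return {"default_owner": "data-team"}


# Costly pattern: this would run on every import of the module, which means
# on every parse of every DAG file that imports it.
# ORG_CONFIG = fetch_org_config()


@functools.lru_cache(maxsize=1)
def get_org_config() -> dict:
    """Fetch the configuration on first use and reuse it within this process."""
    return fetch_org_config()
```

   The cache only lives inside one process, so this helps only if 
`get_org_config()` is called from code that runs at task execution time. 
Calling it at the top level of a DAG file would still pay the cost on every 
parse.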
   
   People often don't realize that just running `import something` might be 
very expensive, so teaching people that it might happen (and that it heavily 
impacts DAG parsing) is an important function of that page.
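
   One quick way to see the cost for yourself is to time the import. This is 
only a sketch using the standard library: `time_import` is an illustrative 
helper, and `json` is used so the snippet runs anywhere (swap in `pandas` or 
your own module).

```python
import importlib
import sys
import time


def time_import(module_name: str) -> float:
    """Return the seconds taken to import module_name in this process."""
    # Drop the cached top-level module so the import executes again.
    # Submodules and dependencies that are already loaded stay cached,
    # so this tends to underestimate a true cold import.
    sys.modules.pop(module_name, None)
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


elapsed = time_import("json")
print(f"import json took {elapsed:.4f} s")
```

   For a cold measurement with a per-module breakdown, CPython also has 
`python -X importtime -c "import something"`.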


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
