coufon commented on issue #5594: [AIRFLOW-4924] Loading DAGs asynchronously in Airflow webserver
URL: https://github.com/apache/airflow/pull/5594#issuecomment-512600274

Thanks Ash for the review. The comments are helpful. Using stringified DAGs does lose information in the UI, and I would like to iterate more on how to minimize this loss.

To your questions:

> This limitation needs to be addressed, it would make dags using custom operators appear wrong in the UI wouldn't it?

Yes. I agree we should improve this. We don't have this in the first version because we think the UI being down is a bigger issue, but we should consider it.

My initial idea to support this is to collect a list of module file paths in the DAG-collecting subprocess. A module path is added whenever the subprocess detects a non-Airflow module. The webserver main process first loads all modules in the list, then unpickles the DAGs.

This may be abused if the user defines Operators in each DAG file: it goes back to the case where the webserver process has to load all DAG files. But as long as the user defines Operators in shared modules, the overhead would be small. How does that sound?

> Additionally I would like to see more detail about what the stringified form is - I don't see any explicit tests covering this.

In the stringified form, all fields that cannot be pickled or unpickled are replaced. Normally there are three cases:

(1) Local functions and lambda functions: they cannot be pickled, so they are replaced by the string of their source code. The source code is displayed normally in the UI; however, a template that uses the function cannot be rendered, and the UI shows an error when the user clicks the template page.

(2) Custom (user-defined) operators: they cannot be unpickled, so they are replaced by BaseOperator and are displayed as BaseOperator in the UI.

(3) Other user-defined classes used as DAG or task fields: they cannot be unpickled, so they are replaced by a class string, e.g., '<__main__.A object at 0x7f4121780828>'
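
To make the replacement rule concrete, here is a minimal sketch of how a single field could be stringified. It is not the code in this PR: `stringify_field` and `stringify_fields` are hypothetical names, and the sketch only approximates cases (1) and (3) by checking a pickle round trip inside one process. It does not model case (2), where unpickling fails in the webserver because the operator's module is not importable there.

```python
import inspect
import pickle


def stringify_field(value):
    """Return `value` unchanged if it survives a pickle round trip.

    Otherwise return a string stand-in: the source code for plain
    functions and lambdas when it is available, else repr(value).
    Hypothetical helper for illustration only.
    """
    try:
        pickle.loads(pickle.dumps(value))
        return value
    except Exception:
        # Pickling errors surface as different exception types depending
        # on the object and Python version, so catch broadly here.
        pass

    if inspect.isfunction(value):
        try:
            # Roughly case (1): keep the source so a UI could display it.
            return inspect.getsource(value).strip()
        except (OSError, TypeError):
            # Source is unavailable (e.g. code defined interactively).
            pass

    # Roughly case (3): fall back to the object's repr, which for a plain
    # object looks like '<module.Class object at 0x...>'.
    return repr(value)


def stringify_fields(fields):
    """Apply stringify_field to every value of a dict of DAG/task fields."""
    return {name: stringify_field(value) for name, value in fields.items()}


if __name__ == "__main__":
    task_fields = {
        "retries": 3,                          # picklable, kept as-is
        "python_callable": lambda x: x + 1,    # not picklable, becomes a string
    }
    for name, value in stringify_fields(task_fields).items():
        print(name, type(value).__name__)
```

A test for the stringified form could follow the same shape: build a field of each kind, run it through the replacement step, and assert on the type of the result.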
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
