coufon removed a comment on issue #5594: [AIRFLOW-4924] Loading DAGs asynchronously in Airflow webserver
URL: https://github.com/apache/airflow/pull/5594#issuecomment-511892627

> I like the idea. As Ash Berlin-Taylor mentioned in JIRA, it likely won't be needed in this form once we implement persisting the DAGs / stateless webserver. However, it sounds like it can be a nice intermediate solution (a smallish incremental change, as opposed to a big structural change in Airflow itself) until we get all the details worked out for those. It could even be cherry-picked if we ever attempt to release 1.10.5.
>
> The idea of stringifying the DAGs is interesting and might be used as a starting point to implement part of DAG persistence (no matter how it is implemented eventually). It covers only the DAG's Python code text and 'structure', of course: there is no way this can be used to actually execute the DAGs, but it serves the purpose well when you want fast loading of many DAGs as Python objects in cases where you have to have many DAG objects in the same process (UI/scheduler). I like the idea of having "stringified" and "real" versions of DAGs, one for structure/code and one for actual execution. It sounds like an interesting optimisation that is pretty independent of any other Airflow features.
>
> And maybe we can benefit from the fact that this solution is already available in Composer (as alpha) and can be tested on a wide variety of configurations (especially since it is aimed at deployments with a large number of DAGs, and I assume customers will only enable it when they have a large number of DAGs with complex structure). It's very valuable for the Airflow community to get code that has already been battle-tested. Zhou Fang -> maybe you can share your experience with actual "production" usage of this?
>
> What I do not fully understand yet about the current implementation is the casting to BaseOperator for non-airflow modules. Zhou Fang - can you maybe explain a bit why this is needed?
>
> I understand that the stringified DAGs must be picklable so they can be sent over a multiprocessing Queue, and then for the UI to create the objects and be able to show the structure and code. The same goes for the scheduler: we only want to use the DAG code to schedule it. Maybe I am missing something, but I am not sure why this BaseOperator casting is needed in this case, as long as all the custom classes are also loaded in the webserver/scheduler.

Thanks Jarek for the comments.
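To make the "stringified" versus "real" DAG distinction above concrete, here is a minimal sketch of a structure-only DAG view that survives pickling. It is not the code from this PR: `stringify_dag`, `round_trip`, the dict layout, and the operator name `MyCustomOperator` are all hypothetical, chosen only for illustration. The sketch keeps each operator's class name as a plain string, so the receiving process would not need to import any custom class to display the structure. Whether that is the actual motivation for the BaseOperator casting is an assumption here.

```python
import pickle


def stringify_dag(dag_id, source_code, task_types, edges):
    """Build a structure-only, plain-data view of a DAG (hypothetical format).

    task_types maps task_id -> operator class name (kept as a string only).
    edges is a list of (upstream_task_id, downstream_task_id) pairs.
    """
    tasks = {
        task_id: {"task_type": task_type, "downstream": []}
        for task_id, task_type in task_types.items()
    }
    for upstream, downstream in edges:
        tasks[upstream]["downstream"].append(downstream)
    return {"dag_id": dag_id, "source_code": source_code, "tasks": tasks}


def round_trip(stringified):
    # multiprocessing.Queue pickles whatever is put on it; this mimics
    # that hop between processes without starting a second process.
    return pickle.loads(pickle.dumps(stringified))


example = stringify_dag(
    dag_id="example",
    source_code="# DAG file text would go here",
    task_types={"extract": "MyCustomOperator", "load": "BashOperator"},
    edges=[("extract", "load")],
)
received = round_trip(example)
```

Because the result contains only dicts, lists, and strings, it can be shown in a UI but cannot be executed, which matches the "structure/code only" limitation described in the quoted comment.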
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
