[ 
https://issues.apache.org/jira/browse/AIRFLOW-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16547113#comment-16547113
 ] 

Kevin Yang commented on AIRFLOW-2762:
-------------------------------------

[~ashb] Thanks for the input. I think that is a good idea, since it would also 
provide some consistency between the scheduler and the webserver. To do that, 
though, we would need to store more of the info the webserver needs in the 
DagModel, e.g. the dependencies. I am also not sure how much extra load that 
would place on the DB. If we go this route, we might want to build a DAG 
parsing component that parses DAGs for both the scheduler and the webserver. 
Before we decide to do that, we can try parallelizing the parsing on the 
webserver. That work can be reused once we have the DAG parsing service, since 
the webserver will be using the serializable info of the DAG instead of the 
DAG object in both cases. 
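
As a rough illustration of the idea (not actual Airflow code), parallel parsing 
that hands back only serializable info could look something like the sketch 
below. Everything in it is an assumption made for illustration: the names 
summarize_dag_file and summarize_all are hypothetical, and the regex scan for 
dag_id is only a stand-in for real DAG parsing. 

```python
import re
from concurrent.futures import Executor, ProcessPoolExecutor
from typing import Dict, Iterable, List, Optional

# Hypothetical stand-in for real DAG parsing: scan the source text for
# dag_id="..." instead of importing the file and building DAG objects.
_DAG_ID = re.compile(r"dag_id\s*=\s*['\"]([^'\"]+)['\"]")


def summarize_dag_file(path: str) -> Dict[str, object]:
    """Return a plain, picklable summary of one DAG file."""
    with open(path, encoding="utf-8") as handle:
        source = handle.read()
    return {"path": path, "dag_ids": _DAG_ID.findall(source)}


def summarize_all(
    paths: Iterable[str], executor: Optional[Executor] = None
) -> List[Dict[str, object]]:
    """Summarize many DAG files in parallel, preserving input order.

    Uses a process pool by default; any concurrent.futures executor
    can be passed in instead.
    """
    path_list = list(paths)
    owns_pool = executor is None
    pool = executor if executor is not None else ProcessPoolExecutor()
    try:
        # Executor.map yields results in the same order as the inputs.
        return list(pool.map(summarize_dag_file, path_list))
    finally:
        # Only shut down a pool that this function created itself.
        if owns_pool:
            pool.shutdown()
```

Because each worker returns a plain dict rather than a DAG object, the same 
result shape could later come from a separate parsing service or from the DB 
without the caller changing. When the default process pool is used from a 
script, the call to summarize_all should sit under an 
if __name__ == "__main__": guard. 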

> Parallelize DAG parsing in webserver
> ------------------------------------
>
>                 Key: AIRFLOW-2762
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2762
>             Project: Apache Airflow
>          Issue Type: Improvement
>            Reporter: Kevin Yang
>            Priority: Major
>
> Currently the webserver parses the DagBag in a single thread, which makes 
> startup slow when there is a large number of DAG files. The webserver 
> should not need the actual DAG objects, and the parsing should be parallelized.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
