[ 
https://issues.apache.org/jira/browse/AIRFLOW-4478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16878677#comment-16878677
 ] 

ASF subversion and git services commented on AIRFLOW-4478:
----------------------------------------------------------

Commit 526c65a57204022596fb69e9478c5515ad0b880e in airflow's branch 
refs/heads/master from Joshua Carp
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=526c65a ]

[AIRFLOW-4478] Lazily instantiate default resources objects. (#5259)

Instantiating `Resources` and its child classes takes non-negligible
time when users create many operators. To save time, don't create the 
resources object until it is needed.
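
A minimal sketch of the lazy approach described in the commit message, not the actual Airflow change. The `Resources` class below is a simplified stand-in with placeholder defaults, and `LazyOperator`, `_resources_spec`, and `instances_created` are hypothetical names used only for illustration:

```python
class Resources:
    """Simplified stand-in for Airflow's Resources (placeholder defaults)."""

    instances_created = 0  # counter used only to make the laziness visible

    def __init__(self, cpus=1, ram=512, disk=512, gpus=0):
        Resources.instances_created += 1
        self.cpus = cpus
        self.ram = ram
        self.disk = disk
        self.gpus = gpus


class LazyOperator:
    """Hypothetical operator that defers building its Resources object."""

    def __init__(self, task_id, resources=None):
        self.task_id = task_id
        self._resources_spec = resources  # dict of overrides, or None
        self._resources = None            # built on first access

    @property
    def resources(self):
        # Pay the construction cost only when something reads the attribute.
        if self._resources is None:
            self._resources = Resources(**(self._resources_spec or {}))
        return self._resources


ops = [LazyOperator(f"task_{i}") for i in range(1000)]
created_before_access = Resources.instances_created

first = ops[0].resources
created_after_access = Resources.instances_created
```

In this sketch, creating the operators builds no `Resources` objects at all; one is built only when `resources` is first read, and later reads reuse it.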

> Operators instantiate many duplicate objects
> --------------------------------------------
>
>                 Key: AIRFLOW-4478
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4478
>             Project: Apache Airflow
>          Issue Type: Improvement
>            Reporter: Josh Carp
>            Assignee: Huihua Zhang
>            Priority: Trivial
>
> `BaseOperator` creates a `Resources` instance, which in turn creates four
> `Resource` instances. Object instantiation in Python isn't free; creating
> `Resources` and its child objects takes ~5μs out of a total of ~20μs to
> instantiate a `BaseOperator` on my system. This time adds up when creating
> tens of thousands of operators, especially in environments like GCP Cloud
> Composer that are very sensitive to DAG parse time.
>
> Assuming that most users don't actually configure task resources, since 
> they're only respected by the non-default `CgroupTaskRunner`, we can save 
> time by creating a single `Resources` instance and sharing it across tasks 
> that don't set `resources`. We could do even better by allowing users to pass 
> a `Resources` instance to `BaseOperator` rather than passing a `dict` that's 
> used to instantiate `Resources`, but that would be a breaking change.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)