[ https://issues.apache.org/jira/browse/AIRFLOW-4478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16882167#comment-16882167 ]
ASF subversion and git services commented on AIRFLOW-4478:
----------------------------------------------------------

Commit e652b206bbc365cde56b0924417cfa3138e9a205 in airflow's branch refs/heads/v1-10-test from Joshua Carp
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=e652b20 ]

[AIRFLOW-4478] Lazily instantiate default resources objects. (#5259)

Instantiating `Resources` and its child classes takes non-negligible
time when users create many operators. To save time, don't create the
resources object until it is needed.

(cherry picked from commit 526c65a57204022596fb69e9478c5515ad0b880e)

> Operators instantiate many duplicate objects
> --------------------------------------------
>
>                 Key: AIRFLOW-4478
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4478
>             Project: Apache Airflow
>          Issue Type: Improvement
>            Reporter: Josh Carp
>            Assignee: Huihua Zhang
>            Priority: Trivial
>             Fix For: 1.10.4
>
>
> `BaseOperator` creates a `Resources` instance, which in turn creates four
> `Resource` instances. Class creation in python isn't free; creating
> `Resources` and its child classes takes ~5μs out of a total of ~20μs to
> instantiate a `BaseOperator` on my system. This time adds up when creating
> tens of thousands of operators, especially in environments like GCP Cloud
> Composer that are very sensitive to DAG parse time.
> Assuming that most users don't actually configure task resources, since
> they're only respected by the non-default `CgroupTaskRunner`, we can save
> time by creating a single `Resources` instance and sharing it across tasks
> that don't set `resources`. We could do even better by allowing users to pass
> a `Resources` instance to `BaseOperator` rather than passing a `dict` that's
> used to instantiate `Resources`, but that would be a breaking change.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
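
For readers who want to see the idea in the commit message concretely, here is a minimal sketch of lazy instantiation. It is not the code from the commit. `Resource`, `Resources`, and `LazyOperator` below are simplified, hypothetical stand-ins for the Airflow classes, and the default quantities are made-up values chosen only for illustration.

```python
class Resource:
    """Hypothetical stand-in for a single resource requirement."""

    instances_created = 0  # counter used only to make the cost visible

    def __init__(self, name, qty):
        Resource.instances_created += 1
        self.name = name
        self.qty = qty


class Resources:
    """Hypothetical stand-in that builds four Resource objects, as the issue describes."""

    def __init__(self, cpus=1, ram=512, disk=512, gpus=0):  # assumed defaults
        self.cpus = Resource("CPU", cpus)
        self.ram = Resource("RAM", ram)
        self.disk = Resource("Disk", disk)
        self.gpus = Resource("GPU", gpus)


class LazyOperator:
    """Simplified operator that defers building Resources until first access."""

    def __init__(self, task_id, resources=None):
        self.task_id = task_id
        self._resources_kwargs = resources  # dict or None; nothing is built yet
        self._resources = None

    @property
    def resources(self):
        # Build the Resources object on first access and cache it afterwards.
        if self._resources is None:
            self._resources = Resources(**(self._resources_kwargs or {}))
        return self._resources


# Creating many operators builds no Resource objects at all.
operators = [LazyOperator(f"task_{i}") for i in range(1000)]
created_before_access = Resource.instances_created

# Only the operator whose resources are actually read pays the cost.
first = operators[0].resources
created_after_access = Resource.instances_created
```

In this sketch the cost of the four `Resource` objects is paid only by operators whose `resources` attribute is read, which matches the assumption in the issue that most users never configure task resources. The `dict` argument is kept so the constructor signature does not change for callers.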