ashb commented on a change in pull request #6792: [AIRFLOW-5930] Use cached-SQL query building for hot-path queries
URL: https://github.com/apache/airflow/pull/6792#discussion_r367127376

##########
File path: airflow/jobs/scheduler_job.py
##########
@@ -1282,21 +1253,15 @@ def _change_state_for_executable_task_instances(self, task_instances,
             task_instance.queued_dttm = timezone.utcnow()
             session.merge(task_instance)
 
-        # Generate a list of SimpleTaskInstance for the use of queuing
-        # them in the executor.
-        simple_task_instances = [SimpleTaskInstance(ti) for ti in
-                                 tis_to_set_to_queued]
-
         task_instance_str = "\n\t".join(
             [repr(x) for x in tis_to_set_to_queued])
 
         session.commit()
         self.log.info("Setting the following %s tasks to queued state:\n\t%s",
                       len(tis_to_set_to_queued), task_instance_str)
-        return simple_task_instances
+        return tis_to_set_to_queued
 
-    def _enqueue_task_instances_with_queued_state(self, simple_dag_bag,
-                                                  simple_task_instances):
+    def _enqueue_task_instances_with_queued_state(self, simple_dag_bag, task_instances):

Review comment:
This is called with the result of `self._change_state_for_executable_task_instances`, which used to return this:

```
simple_task_instances = [SimpleTaskInstance(ti) for ti in tis_to_set_to_queued]
```

But one of the slowdowns I noticed was continually re-looking up the TI, so it has been changed to pass full TaskInstance objects around (we are all in the same process anyway).

I'll update the docs to reflect the new code/type.
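
For illustration only, here is a minimal sketch of the trade-off described in the review comment. It does not use Airflow's actual classes: `FullTask`, `SimpleTask`, `Store`, and the four functions are hypothetical stand-ins, and `Store` only counts lookups in place of a real database session. The point is that returning lightweight copies forces the next step to look each object up again, while returning the full objects (safe here because both steps run in the same process) avoids that.

```python
from dataclasses import dataclass


@dataclass
class FullTask:
    """Hypothetical stand-in for a full task instance object."""
    dag_id: str
    task_id: str
    state: str = "scheduled"


@dataclass(frozen=True)
class SimpleTask:
    """Hypothetical stand-in for a lightweight, key-only copy."""
    dag_id: str
    task_id: str


class Store:
    """Hypothetical stand-in for a DB session; counts how often we look things up."""

    def __init__(self, tasks):
        self._by_key = {(t.dag_id, t.task_id): t for t in tasks}
        self.lookups = 0

    def get(self, dag_id, task_id):
        self.lookups += 1
        return self._by_key[(dag_id, task_id)]


def set_queued_old(tasks):
    """Old shape: mark tasks queued, then return lightweight copies."""
    for t in tasks:
        t.state = "queued"
    return [SimpleTask(t.dag_id, t.task_id) for t in tasks]


def enqueue_old(store, simple_tasks):
    """Old shape: every lightweight copy has to be looked up again."""
    return [store.get(s.dag_id, s.task_id) for s in simple_tasks]


def set_queued_new(tasks):
    """New shape: mark tasks queued and return the same full objects."""
    for t in tasks:
        t.state = "queued"
    return tasks


def enqueue_new(tasks):
    """New shape: the full objects are already in hand, so no lookup is needed."""
    return list(tasks)
```

With two tasks, the old shape performs two extra lookups in `enqueue_old`, while the new shape performs none and hands back the very same objects it was given.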