Re: [PR] Don't get DAG out of DagBag when we already have it [airflow]

2023-12-20 Thread via GitHub
github-actions[bot] closed pull request #35243: Don't get DAG out of DagBag when we already have it URL: https://github.com/apache/airflow/pull/35243

Re: [PR] Don't get DAG out of DagBag when we already have it [airflow]

2023-12-14 Thread via GitHub
github-actions[bot] commented on PR #35243: URL: https://github.com/apache/airflow/pull/35243#issuecomment-1857050406 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for you

Re: [PR] Don't get DAG out of DagBag when we already have it [airflow]

2023-10-30 Thread via GitHub
jens-scheffler-bosch commented on PR #35243: URL: https://github.com/apache/airflow/pull/35243#issuecomment-1785374317 While sitting on the train, failing to build the Airflow container via breeze, I re-inspected the code. I believe I have now found the root cause of the performance problem we

Re: [PR] Don't get DAG out of DagBag when we already have it [airflow]

2023-10-30 Thread via GitHub
AutomationDev85 commented on PR #35243: URL: https://github.com/apache/airflow/pull/35243#issuecomment-1785061958 During runtime measurements a few months ago I ran into the issue that these lines consumed a lot of time when we schedule a DAG with many DAG runs. I was not aware that there is s

Re: [PR] Don't get DAG out of DagBag when we already have it [airflow]

2023-10-30 Thread via GitHub
ashb commented on PR #35243: URL: https://github.com/apache/airflow/pull/35243#issuecomment-1784926965 The difference between an LRU cache and the cache in dagbag is that the latter does a `datetime.now()` call (more or less). Additionally the change here to `dag = dag_run.dag or self
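
To illustrate the distinction drawn in this comment, here is a minimal sketch in plain Python. It is not Airflow's `DagBag` code: `get_dag_lru`, `ExpiringDagCache`, `_load_dag` and `LOAD_COUNT` are hypothetical names, and treating the `datetime.now()` call as a per-lookup age check is an assumption based on the expiry check mentioned elsewhere in the thread.

```python
from datetime import datetime, timedelta, timezone
from functools import lru_cache

# Hypothetical counters so the number of "expensive" loads can be observed.
LOAD_COUNT = {"lru": 0, "expiring": 0}


def _load_dag(dag_id, counter_key):
    """Stand-in for an expensive lookup (e.g. a DB query); not an Airflow API."""
    LOAD_COUNT[counter_key] += 1
    return {"dag_id": dag_id}


@lru_cache(maxsize=128)
def get_dag_lru(dag_id):
    # A plain LRU cache never looks at the clock: an entry stays valid
    # until it is evicted or the cache is cleared.
    return _load_dag(dag_id, "lru")


class ExpiringDagCache:
    """Cache that compares each entry's age against the current time on lookup."""

    def __init__(self, max_age):
        self.max_age = max_age
        self._entries = {}  # dag_id -> (dag, loaded_at)

    def get_dag(self, dag_id, now=None):
        # Assumption: this is the extra per-lookup work ("datetime.now(), more or less").
        now = now or datetime.now(timezone.utc)
        entry = self._entries.get(dag_id)
        if entry is not None:
            dag, loaded_at = entry
            if now - loaded_at < self.max_age:
                return dag
        # Missing or too old: reload and remember when it was loaded.
        dag = _load_dag(dag_id, "expiring")
        self._entries[dag_id] = (dag, now)
        return dag
```

In this sketch, repeated calls to `get_dag_lru("a")` load once, while `ExpiringDagCache.get_dag("a")` reloads whenever the stored entry is older than `max_age`. The optional `now` parameter exists only to make the behaviour easy to exercise without waiting.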

Re: [PR] Don't get DAG out of DagBag when we already have it [airflow]

2023-10-28 Thread via GitHub
jens-scheffler-bosch commented on PR #35243: URL: https://github.com/apache/airflow/pull/35243#issuecomment-1783937825 Yes, I was also scratching my head over this. Obviously there is some basic caching, but with an expiry check. The main driver for the `lru_cache` was the use in `_get_

Re: [PR] Don't get DAG out of DagBag when we already have it [airflow]

2023-10-28 Thread via GitHub
ashb commented on PR #35243: URL: https://github.com/apache/airflow/pull/35243#issuecomment-1783904570 >> Because querying the DAG on our slow DB took between 50 ms and 250 ms, whether you execute this once or 60 times during one scheduler loop run makes a big difference.

Re: [PR] Don't get DAG out of DagBag when we already have it [airflow]

2023-10-28 Thread via GitHub
ashb commented on code in PR #35243: URL: https://github.com/apache/airflow/pull/35243#discussion_r1375310439 ## airflow/jobs/scheduler_job_runner.py: ## @@ -1064,12 +1063,8 @@ def _do_scheduling(self, session: Session) -> int: callback_tuples = self._schedule_all_d

[PR] Don't get DAG out of DagBag when we already have it [airflow]

2023-10-28 Thread via GitHub
ashb opened a new pull request, #35243: URL: https://github.com/apache/airflow/pull/35243 Two things here: 1. By the point we are looking at the "callbacks", `dagrun.dag` will already be set (the `or dagbag.get_dag` is a safety precaution; it might not be required or worth it). 2.
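
The `or dagbag.get_dag` safety precaution from point 1 can be sketched roughly as below. The classes are simplified stand-ins, not Airflow's `DagRun` or `DagBag`, and `resolve_dag` is a hypothetical helper; the real `get_dag` signature is not shown in this thread and may take further arguments.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeDag:
    """Hypothetical stand-in for a DAG object."""
    dag_id: str


@dataclass
class FakeDagRun:
    """Hypothetical stand-in for a DAG run; `dag` may or may not be set yet."""
    dag_id: str
    dag: Optional[FakeDag] = None


class FakeDagBag:
    """Hypothetical stand-in that counts how often a lookup is performed."""

    def __init__(self):
        self.lookups = 0

    def get_dag(self, dag_id):
        self.lookups += 1
        return FakeDag(dag_id)


def resolve_dag(dag_run, dagbag):
    # Use the DAG already attached to the run; only fall back to the
    # (potentially slow) DagBag lookup when it is missing.
    return dag_run.dag or dagbag.get_dag(dag_run.dag_id)
```

With `dag_run.dag` already populated, `resolve_dag` returns it without touching the bag at all; the lookup only happens for a run whose `dag` is still `None`.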