Hi Ash,

Thanks for the suggestion! Great to hear that the scheduling issue should be resolved now.

I still want to look into DAG file parsing optimization, as it feels like even after AIP-15 the delay between a DAG directory update and the changes being reflected in the DB can be significant with a large number of DAG files (1500+ in our case). As you suggest, I will measure on 2.0 beta first, though.

Thank you,
Oleksandr

On Thu, 12 Nov 2020 at 19:47, Ash Berlin-Taylor <[email protected]> wrote:

> Hi Oleksandr,
>
> So, not to short circuit the discussion, but with the HA work I did
> (AIP-15) that is available in 2.0.0beta2, the scheduler has been massively
> overhauled, and one of the changes was to break the tie between parsing and
> scheduling; the scheduler now operates on the serialised version from the
> db.
>
> So dag file parsing time is much less of a limitation.
>
> But some "max_dag_files_per_parser" setting may help, but I'd see if 2.0
> fixes your performance issues first.
>
> -Ash
>
>
> On 12 November 2020 17:21:33 GMT, Oleksandr Muliar <
> [email protected]> wrote:
>>
>> Hello, everyone!
>>
>> I hope this is the right place to ask about this, please redirect me
>> otherwise :)
>>
>> I was looking into how dag files are imported, and noticed that airflow
>> creates a whole new process for each file that can potentially contain
>> DAGs, and then closes the process after only processing a single file.
>>
>> It would seem to me that keeping the process around to parse multiple
>> files would be much more efficient (keeps sqlalchemy connections around,
>> for example). Is there a specific reason this design was selected, and if
>> no - is there any interest in changing this?
>>
>> The initial reason for me to look into this is that DagBag filling time
>> seems to be rather slow when we have a significant amount of dag files
>> (more than a thousand files)
>>
>> Regards,
>> Oleksandr
>>
>>
>> ------------------------------
>> This email and any files transmitted with it contain confidential
>> information and/or privileged or personal advice.
>> This email is intended
>> for the addressee(s) stated above only. If you are not the addressee of the
>> email please do not copy or forward it or otherwise use it or any part of
>> it in any form whatsoever. If you have received this email in error please
>> notify the sender and remove the e-mail from your system. Thank you.
>>
>> This is an email from the company Just Eat Takeaway.com N.V., a public
>> limited liability company with corporate seat in Amsterdam, the
>> Netherlands, and address at Oosterdoksstraat 80, 1011 DK Amsterdam,
>> registered with the Dutch Chamber of Commerce with number 08142836 and
>> where the context requires, includes its subsidiaries and associated
>> undertakings.
>>
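
To make the question in the thread concrete, here is a minimal Python sketch of the trade-off between one process per file and one process per batch of files. It is not Airflow's code: make_batches, parse_batch, count_processes and the files_per_process knob are hypothetical names chosen for illustration (the knob is only loosely modelled on the "max_dag_files_per_parser" idea floated above), and the "parsing" is a stand-in that just records which process handled each file. The only library feature relied on is the maxtasksperchild argument of multiprocessing.Pool.

```python
import multiprocessing as mp
import os
from typing import List, Sequence, Tuple


def make_batches(paths: Sequence[str], files_per_process: int) -> List[List[str]]:
    """Split the file list into batches; each batch is handled by one process."""
    if files_per_process < 1:
        raise ValueError("files_per_process must be at least 1")
    return [
        list(paths[i:i + files_per_process])
        for i in range(0, len(paths), files_per_process)
    ]


def parse_batch(batch: List[str]) -> List[Tuple[str, int]]:
    """Stand-in for real parsing: do per-process setup once, then handle every file.

    Returns (path, pid) pairs so the caller can see which process did the work.
    """
    pid = os.getpid()  # a real worker might open its DB session here, once per batch
    return [(path, pid) for path in batch]


def count_processes(paths: Sequence[str], files_per_process: int, workers: int = 4) -> int:
    """Parse all files and return how many distinct worker processes were used."""
    batches = make_batches(paths, files_per_process)
    # maxtasksperchild=1 retires each worker after a single batch, so the batch
    # size alone decides how many files one process gets to see.
    with mp.Pool(processes=workers, maxtasksperchild=1) as pool:
        results = pool.map(parse_batch, batches, chunksize=1)
    return len({pid for batch in results for _, pid in batch})


if __name__ == "__main__":
    files = [f"dags/dag_{n}.py" for n in range(40)]  # hypothetical file names
    print("one file per process: ", count_processes(files, 1))
    print("ten files per process:", count_processes(files, 10))
```

With files_per_process=1 every file pays the full process start-up cost, which is the behaviour described in the original question; larger batches amortise that cost, at the price of a long-lived process carrying state from one file to the next. Whether this helps in practice would still need measuring, as suggested above.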
