Thanks a lot, Ash! This would absolutely help a lot.
XD

On Tue, Feb 9, 2021 at 5:17 PM Daniel Imberman <[email protected]> wrote:

> Thank you for your work on this Ash!
>
> One thing to mention is that while this only directly affects the
> committer/PMC runs, it should still free up more resources overall.
>
> Might also be worth bringing this up to the ASF board as perhaps other
> projects can consider similar methods.
>
> On Tue, Feb 9, 2021 at 6:16 AM, Kaxil Naik <[email protected]> wrote:
>
> Great work on this, I know how much time you dedicated to it.
>
> Regards,
> Kaxil
>
> On Tue, Feb 9, 2021 at 1:40 PM Ash Berlin-Taylor <[email protected]> wrote:
>
>> Hi everyone.
>>
>> After a good two weeks of playing whack-a-mole with bugs, I have finally
>> merged https://github.com/apache/airflow/pull/13730 which means that
>> *some* builds now run on machines under our control.
>>
>> The biggest difference this will make is that 1) we won't be stuck in a
>> queue behind other ASF projects waiting for our "slot", 2) builds should
>> also be a bit faster now due to running most of the build on tmpfs.
>>
>> I will do a more in-depth write up soon, but the rough architecture is:
>>
>> - A GitHub application receives events and whenever* a check-run is
>> created that posts to:
>> - An AWS Lambda function (via API gateway) that checks if there is an idle
>> runner already
>> - an ASG that configures r5a.xlarge instances with tmpfs in "interesting"
>> places (docker store, tmp dirs etc)
>> - Some clever processes on the instance that set/clear ScaleInProtection
>> so that running jobs don't get killed, and emit a custom CloudWatch metric
>> - A CloudWatch alarm to scale down the ASG when nodes are idle
>> - A paid-for docker hub user on these machines to avoid hitting pull
>> limits.
>>
>> The major downside is that due to security concerns, builds for non
>> committers/PMC members still run on the public queue. However the "build
>> image" step for everyone now runs on our machines, so everyone should
>> benefit a bit.
>>
>> I do expect a bit of fallout from this, so I will be monitoring the
>> Actions queue, but if there are any problems or issues let me know (here,
>> or on Slack).
>>
>> -ash
>>
>
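For anyone trying to picture two of the pieces in Ash's architecture list (the "is there an idle runner already" check and the set/clear of ScaleInProtection), here is a minimal sketch. It is not the code from the PR: the `Runner` class, both function names, and the ASG name in the usage note are hypothetical. The only real API assumed is the `set_instance_protection` call on a boto3 Auto Scaling client, which is passed in as a parameter so the sketch runs without AWS access.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Runner:
    """Hypothetical view of one self-hosted runner instance."""
    instance_id: str
    busy: bool


def needs_scale_up(runners: List[Runner]) -> bool:
    """Return True when no idle runner exists for a newly created check-run.

    An empty fleet also counts as "no idle runner".
    """
    return not any(not r.busy for r in runners)


def set_job_protection(autoscaling: Any, asg_name: str,
                       instance_id: str, running: bool) -> None:
    """Set scale-in protection while a job runs, clear it when the job ends.

    `autoscaling` is assumed to behave like boto3.client("autoscaling").
    """
    autoscaling.set_instance_protection(
        InstanceIds=[instance_id],
        AutoScalingGroupName=asg_name,
        ProtectedFromScaleIn=running,
    )
```

On an instance this might be called as `set_job_protection(boto3.client("autoscaling"), "ci-runners", instance_id, True)` when a job starts and again with `False` when it finishes, where `"ci-runners"` is a placeholder name. Once protection is cleared, the CloudWatch alarm can scale the idle node away.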
