Re: [DISCUSS] Consider disabling self-hosted runners for commiter PRs

Aritra Basu Fri, 05 Apr 2024 04:47:57 -0700

I'm +0. Definitely don't see any issue with seeing the changes.

--
Regards,
Aritra Basu


On Fri, Apr 5, 2024, 3:37 PM Amogh Desai <amoghdesai....@gmail.com> wrote:

> +1 I like the idea.
> Looking forward to seeing the difference.
>
> Thanks & Regards,
> Amogh Desai
>
>
> On Fri, Apr 5, 2024 at 3:54 AM Ferruzzi, Dennis <ferru...@amazon.com.invalid> wrote:
>
> > Interested in seeing the difference, +1
> >
> >
> >  - ferruzzi
> >
> >
> > ________________________________
> > From: Oliveira, Niko <oniko...@amazon.com.INVALID>
> > Sent: Thursday, April 4, 2024 2:00 PM
> > To: dev@airflow.apache.org
> > Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Consider disabling
> > self-hosted runners for commiter PRs
> >
> > +1 I'd love to see this as well.
> >
> > In the past, the instability and long queue times of PR builds have been
> > very frustrating. I'm not 100% sure this is due to using self-hosted
> > runners, since a queue depth of 35 (to my mind) should be plenty. But
> > something about the queuing in that setup has never seemed quite right to
> > me. Switching to public runners for a while as an experiment would be a
> > great way to see if it improves.
> >
> > ________________________________
> > From: Pankaj Koti <pankaj.k...@astronomer.io.INVALID>
> > Sent: Thursday, April 4, 2024 12:41:02 PM
> > To: dev@airflow.apache.org
> > Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Consider disabling
> > self-hosted runners for commiter PRs
> >
> > +1 from me to this idea.
> >
> > Sounds very reasonable to me.
> > At times, my experience has been better with public runners instead of
> > self-hosted runners :)
> >
> > And as already mentioned in the discussion, I think the ability to apply
> > the label "use-self-hosted-runners" at critical times would be nice to
> > have too.
> >
> >
> > On Fri, 5 Apr 2024, 00:50 Jarek Potiuk, <ja...@potiuk.com> wrote:
> >
> > > Hello everyone,
> > >
> > > TL;DR: With some recent changes in GitHub Actions, and the fact that the
> > > ASF has a lot of donated runners available for all its builds, I think we
> > > could experiment with disabling "self-hosted" runners for committer builds.
> > >
> > > Our self-hosted runners have been extremely helpful (and we should again
> > > thank Amazon and Astronomer for donating credits / money for those) back
> > > when the GitHub public runners were far less powerful and fewer of them
> > > were available for ASF projects. This saved us a LOT of trouble when there
> > > was contention between ASF projects.
> > >
> > > But as of recently both limitations have been largely removed:
> > >
> > > * ASF has 900 public runners donated by GitHub to all projects
> > > * Those public runners (as of January) now have 4 CPUs and 16GB of memory
> > >   for open-source projects -
> > >   https://github.blog/2024-01-17-github-hosted-runners-double-the-power-for-open-source/
> > >
> > > While they are not as powerful as our self-hosted runners, the parallelism
> > > we utilise brings those builds into not-that-bad shape compared to
> > > self-hosted runners. The typical time now for the complete set of tests is
> > > ~20m on public runners and ~14m on self-hosted ones.
> > >
> > > But this is not the only factor - I think committers experience "Job
> > > failed" on self-hosted runners generally much more often than
> > > non-committers (the stability of our solution is not the best, and we are
> > > also using cheaper spot instances). Plus - we limit the total number of
> > > self-hosted runners (35) - so if several committers submit a few PRs and we
> > > have a canary build running, the jobs will wait until runners are available.
> > >
> > > And of course it costs the sponsors' credits/money, which we could use for
> > > other things.
> > >
> > > I have - as of recently - access to GitHub Actions metrics - and while the
> > > ASF is keeping an eye on usage and has started limiting the number of
> > > parallel jobs that workflows in projects can run, it looks like even if all
> > > committer runs are added to the public runners, our usage will still be far
> > > below the limits and far lower than that of some other projects (which I
> > > will not name here). I have access to the metrics so I can monitor our
> > > usage and react.
> > >
> > > I think possibly - if we switch committers to "public" runners by default -
> > > the experience will not be much worse for them (and sometimes even better -
> > > because of stability/limited queue).
> > >
> > > I have been planning this carefully - I recently made a number of
> > > refactors/changes to our workflows that make it way easier to manipulate
> > > the configuration and get various conditions applied to various jobs - so
> > > changing/experimenting with those settings should be - well - a breeze :).
> > > A few recent changes have proven that this change and the workflow refactor
> > > were definitely worth the effort. I feel like I finally got control over
> > > it, where previously it was a bit like herding a pack of cats (which I
> > > brought to life myself, but that's another story).
> > >
> > > I would like to propose running an experiment to see how it works if we
> > > switch committer PRs back to the public runners - leaving the self-hosted
> > > runners only for canary builds (which makes perfect sense because those
> > > builds run a full set of tests and we need as much speed and power there
> > > as we can get).
> > >
> > > This is pretty safe. We should be able to switch back very easily if we see
> > > problems. I will also monitor it and see if our usage is within the limits
> > > of the ASF. I can also add a feature that lets committers use self-hosted
> > > runners by applying the "use self-hosted runners" label to a PR.
> > >
> > > Running it for 2-3 weeks should be enough to gather experience from
> > > committers - whether things will seem better or worse for them - or maybe
> > > they won't really notice a big difference.
> > >
> > > Later we could consider some next steps - disabling the self-hosted runners
> > > for canary builds if we see that our usage is low and builds are fast
> > > enough, and eventually possibly removing the current self-hosted runners
> > > and switching to a better k8s-based infrastructure (which we are close to
> > > doing, but it is a bit difficult while the current self-hosted solution is
> > > so critical to keep running - like rebuilding the plane while it is
> > > flying). I'd love to do it gradually in the "change slowly and observe"
> > > mode - especially now that I have access to "proper" metrics.
> > >
> > > WDYT?
> > >
> > > J.
> > >
> >
>
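
For readers skimming the thread, the runner-selection rule proposed in the
quoted message (public runners by default for committer PRs, self-hosted
runners for canary builds, and a label to opt a PR back in) could be sketched
roughly as below. This is only a hypothetical illustration, not the project's
actual workflow code: the function name, its parameters, and the returned
strings are all assumptions, and the label text is simply copied from the
proposal.

```python
from typing import Iterable

# Label text as written in the proposal; the final label name is an assumption.
SELF_HOSTED_LABEL = "use self-hosted runners"


def select_runner(is_canary: bool, pr_labels: Iterable[str] = ()) -> str:
    """Return "self-hosted" or "public" for a build (hypothetical helper)."""
    # Canary builds run the full set of tests, so they keep the faster runners.
    if is_canary:
        return "self-hosted"
    # Opt-in: the proposal lets committers apply a label to a PR.
    # This sketch assumes only committers are able to apply that label.
    if SELF_HOSTED_LABEL in set(pr_labels):
        return "self-hosted"
    # Everything else, including committer PRs, defaults to public runners.
    return "public"
```

Under these assumptions, a committer PR without the label would land on a
public runner, and switching the experiment off again would only mean changing
the default in the last line.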
