On Tue, Feb 4, 2020 at 4:29 AM Alex Harui <[email protected]>
wrote:
Hopefully last set of questions for now...
Just wait, the rabbit hole gets deeper :)
1) It sounds like there is a risk that as the ASF grows, GH may not be
able to grow with us. Did I understand that correctly?
GH CI may not be willing to continue giving us free usage. The current
free usage we have is limited, but they are willing to augment - to
what degree we aren't sure yet. We're talking with Github.
Github the VCS will always be free (at least for all versions of the
future that I can foresee, short of Github being shuttered).
2) If we have money to offer GH, why can't we offer money to the CI
Vendors so we aren't really abusing their free tiers?
We currently pay one CI vendor (Travis - the only one aside from GH
that doesn't need write access). We pay them 12k a year, and are
planning on increasing that spend in next year's budget.
We've discussed paying or getting cloud credits from both Azure and
AWS - but ran into the write access problem.
We're currently discussing with GH getting credits or paying them for
more Github Actions capacity.
3) Does GH track my activity in the ASF GH repos as part of the API
usage for Apache? IOW, am I adding to the ASF API count by closing an
issue on github.com? Or if I ran a script on my computer that closed the
issue by using their API?
No, it's tied to our user/IP address. Your actions likely won't come
close to our complex usage.
I think builds.a.o is a great free service, but AIUI, the
no-third-party-write-access rule is independent of whether CI is free or
not. I cannot pay money and get write-access to the ASF repos. So I think
I'm trying to see if there is a solution even if it did cost money.
I should have been more explicit - we aren't opposed to spending money
on this, and do already spend some money. I'm worried that there is no
limit to the money that could be spent - particularly when people
don't have good insight into what their builds might cost the
Foundation. So for instance, there was a project at the ASF that
consumed 900 dollars/month of our 1000/month spend with Travis. They
didn't realize that they were consuming so much. They also didn't
realize that other projects were feeling the pain - they had optimized
their CI builds to execute really fast in Travis - essentially
concurrently consuming every builder. But the reality is that some
projects need more resources than others and allocating resources
appropriately becomes quite the challenge.
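To make the arithmetic concrete, here is a minimal sketch of how per-project shares of a fixed monthly CI budget could be computed and flagged. The function names, the 50% threshold, and the project names are hypothetical; only the 900-of-1000 figures come from the example above.

```python
def budget_shares(monthly_budget, spend_by_project):
    """Return each project's fraction of the monthly CI budget."""
    if monthly_budget <= 0:
        raise ValueError("monthly_budget must be positive")
    return {name: spend / monthly_budget
            for name, spend in spend_by_project.items()}


def heavy_consumers(monthly_budget, spend_by_project, threshold=0.5):
    """List projects whose share exceeds the (assumed) threshold."""
    shares = budget_shares(monthly_budget, spend_by_project)
    return sorted(name for name, share in shares.items() if share > threshold)


# Hypothetical project names; 900 of 1000 dollars/month as in the example.
spend = {"project-a": 900, "all-other-projects": 100}
shares = budget_shares(1000, spend)
flagged = heavy_consumers(1000, spend)
```

A report like this would only surface the imbalance; deciding how much each project should actually get is the harder allocation question.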
Thanks in advance,
-Alex
On 2/3/20, 7:03 PM, "David Nalley" <[email protected]> wrote:
On Tue, Feb 4, 2020 at 3:56 AM Alex Harui <[email protected]>
wrote:
>
> Some questions inline. Apologies in advance for not really
understanding this stuff. I'm primarily a client-side developer. My
projects do not have automated PR testing at this point in time. I'm
mainly exploring in case we become popular enough some day to need it.
>
> My line of thinking is that MS has, at least for now, generously
provided free Azure VMs to ASF committers. If N committers from a project
each get a VM, run CI on it, figure out some way to distribute PRs to those
VMs, is there a viable workflow?
>
> On 2/3/20, 6:38 PM, "David Nalley" <[email protected]> wrote:
>
> Hi Alex,
>
> So this was explored. It creates some problems. First, it doubles the
> administration overhead. Most of that is automated, but it means that
> our API usage doubles, and we're already hitting limits from Github.
>
> Is that a max-traffic limit or a limit on traffic before we have
to start paying for usage?
Max number of calls - and we've tried offering up money, they don't
offer a product with more API calls. Greg has even raised this issue
all the way to the CEO of Github.
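For anyone who wants to see how close a given token is to the call limit, GitHub REST API responses carry `X-RateLimit-Limit`, `X-RateLimit-Remaining`, and `X-RateLimit-Reset` headers. A minimal sketch of reading them follows; the function name and the sample numbers are made up for illustration and are not ASF's actual figures.

```python
def rate_limit_status(headers):
    """Summarize GitHub rate-limit headers from one API response.

    `headers` is a plain dict of response headers; lookup is
    case-insensitive because HTTP header names are.
    """
    lowered = {key.lower(): value for key, value in headers.items()}
    limit = int(lowered["x-ratelimit-limit"])
    remaining = int(lowered["x-ratelimit-remaining"])
    reset_epoch = int(lowered["x-ratelimit-reset"])  # UTC epoch seconds
    used_fraction = (limit - remaining) / limit if limit else 0.0
    return {
        "limit": limit,
        "remaining": remaining,
        "used_fraction": used_fraction,
        "reset_epoch": reset_epoch,
    }


# Illustrative header values only.
sample_headers = {
    "X-RateLimit-Limit": "5000",
    "X-RateLimit-Remaining": "1250",
    "X-RateLimit-Reset": "1580800000",
}
status = rate_limit_status(sample_headers)
```

This only reports usage for the account behind the request; it does not raise the ceiling, which is the actual problem described above.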
>
> Second - at least one CI vendor thanked us for not doing exactly that
> - because the 'best' way to do it is to create an org per project or
> org per repo - and then the free tier is dedicated to that org. Except
> that's essentially abusing their free tier.
>
> Is "best" defined as lowest cost to the CI vendor or something
else? What would the "second-best" scenario look like if there is one?
Best - well, it's the cheapest for us, and it gives the most control to
the projects. So great from that perspective, but likely a bit
unethical and abusive. It's essentially abusing all of the CI vendors'
generosity by horizontally scaling our consumption of their freebies
and using them per repo or per project instead of per organization.
>
> Finally - from a practical perspective, if everyone submits PRs and
> does testing against this apacheci org - that has become the de facto
> repo - it's where everyone is doing their work, and it makes
> provenance tracking harder.
>
> Didn't the ASF have read-only mirrors of repos? I think it led to
> some confusion, but I think folks still figured it out.
>
Not anymore.
We have an active-active copy of the repositories. People can actively
commit against either our repos or the GH repos, and we magically move
commits between the two. (There's an upcoming blog post on how all of
this magic works.)
> As an aside - the mandate for no write access is not an infrastructure
> policy, it's a legal affairs requirement - we're merely implementing
> it.
>
> --David
>
> On Tue, Feb 4, 2020 at 3:24 AM Alex Harui
<[email protected]> wrote:
> >
> > Moving board@ to BCC. Attempting to move discussion to builds@
> >
> > I’m fine with the ASF maintaining its position on stricter
provenance and therefore disallowing third-party write-access to repos.
> >
> > A suggestion was made, if I understood it correctly, to
create a whole other set of repos that could be written to by
third-parties. Would such a thing work? Then a committer would have to
manually bring commits back from that other set to the canonical repo.
That seems viable to me.
> >
> > A concern was raised that the project might cut its release
from the “other set”, but IMO, that would be ok if the release artifacts
could be verified, which should be possible by comparing the canonical repo
against the “other repo”, at least for the source package, and if there are
reproducible binaries, for the binary artifacts as well.
> >
> > Thoughts?
> > -Alex
> >
> > From: Greg Stein <[email protected]>
> > Reply-To: "[email protected]" <[email protected]>
> > Date: Monday, February 3, 2020 at 5:17 PM
> > To: "[email protected]" <[email protected]>
> > Subject: Re: [CI] What are the troubles projects face with
CI and Infra
> >
> > On Mon, Feb 3, 2020 at 6:48 PM Alex Harui <[email protected]> wrote:
> > >...
> > How does Google or other non-ASF open source projects manage
the provenance tracking?
> >
> > Note that most F/OSS projects don't worry about provenance
to the level the Foundation worries. That affords them some flexibility
that our choices do not allow. Those projects may also choose to trust
tools with write access to their repositories, hoping they will not Do
Something Bad(tm). We have chosen to not provide that trust.
> >
> > IMO, I do not think the Foundation should relax its stance
on provenance, nor trust in third parties ... but that is one of the key
considerations [for the Board] at the heart of being able to leverage some
third party CI/CD services.
> >
> > Cheers,
> > -g
> >
>
>