On Tue, Feb 4, 2020 at 4:29 AM Alex Harui <[email protected]> wrote:
>
> Hopefully last set of questions for now...

Just wait, the rabbit hole gets deeper :)

>
> 1) It sounds like there is a risk that as the ASF grows, GH may not be able 
> to grow with us.  Did I understand that correctly?

GH CI may not be willing to continue giving us free usage. The current
free usage we have is limited, but they are willing to augment it - to
what degree we aren't sure yet. We're talking with GitHub.
GitHub the VCS will always be free (at least for all versions of the
future that I can foresee, short of GitHub being shuttered).


> 2) If we have money to offer GH, why can't we offer money to the CI Vendors 
> so we aren't really abusing their free tiers?

We currently pay one CI vendor (Travis - the only one aside from GH
that doesn't need write access). We pay them 12k a year, and are
planning on increasing that spend in next year's budget.
We've discussed paying or getting cloud credits from both Azure and
AWS - but ran into the write access problem.
We're currently discussing with GH getting credits or paying them for
more GitHub Actions capacity.

> 3) Does GH track my activity in the ASF GH repos as part of the API usage for 
> Apache?  IOW, am I adding to the ASF API count by closing an issue on 
> github.com?  Or if I ran a script on my computer that closed the issue by 
> using their API?

No, the limit is tied to our user/IP address, not yours. Your actions
likely won't come close to our complex usage.
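
As a rough illustration of why the limit follows the caller rather than
the repository: GitHub's REST API reports the caller's own quota in the
X-RateLimit-Limit, X-RateLimit-Remaining and X-RateLimit-Reset response
headers. A minimal Python sketch, assuming those headers have already
been collected into a plain dict (the summarize_rate_limit helper and
the sample numbers are hypothetical, not anything Infra actually runs):

```python
from datetime import datetime, timezone


def summarize_rate_limit(headers, now=None):
    """Summarize GitHub rate-limit headers for a single caller.

    `headers` is assumed to be a plain dict using GitHub's header
    spelling; a real HTTP client may expose them case-insensitively.
    """
    now = now or datetime.now(timezone.utc)
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    # X-RateLimit-Reset is a UTC epoch timestamp in seconds.
    reset_at = datetime.fromtimestamp(
        int(headers["X-RateLimit-Reset"]), tz=timezone.utc
    )
    return {
        "limit": limit,
        "remaining": remaining,
        "used_fraction": (limit - remaining) / limit,
        "seconds_until_reset": max(0, int((reset_at - now).total_seconds())),
    }


# Hypothetical sample values, for illustration only.
sample_headers = {
    "X-RateLimit-Limit": "5000",
    "X-RateLimit-Remaining": "1250",
    "X-RateLimit-Reset": "1580790000",
}
sample_now = datetime.fromtimestamp(1580789400, tz=timezone.utc)
summary = summarize_rate_limit(sample_headers, now=sample_now)
```

Each authenticated user (or, for anonymous calls, each IP address) gets
its own counters, so a script run under your own account would draw on
your quota rather than ours.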
>
> I think builds.a.o is a great free service, but AIUI, the 
> no-third-party-write-access rule is independent of whether CI is free or not. 
>  I cannot pay money and get write-access to the ASF repos.  So I think I'm 
> trying to see if there is a solution even if it did cost money.
>

I should have been more explicit - we aren't opposed to spending money
on this, and do already spend some money. I'm worried that there is no
limit to the money that could be spent - particularly when people
don't have good insight into what their builds might cost the
Foundation. So for instance, there was a project at the ASF that
consumed 900 dollars/month of our 1000/month spend with Travis. They
didn't realize that they were consuming so much. They also didn't
realize that other projects were feeling the pain - they had optimized
their CI builds to execute really fast in Travis - essentially
concurrently consuming every builder. But the reality is that some
projects need more resources than others and allocating resources
appropriately becomes quite the challenge.

> Thanks in advance,
> -Alex
>
> On 2/3/20, 7:03 PM, "David Nalley" <[email protected]> wrote:
>
>     On Tue, Feb 4, 2020 at 3:56 AM Alex Harui <[email protected]> wrote:
>     >
>     > Some questions inline.  Apologies in advance for not really
>     > understanding this stuff.  I'm primarily a client-side developer.  My
>     > projects do not have automated PR testing at this point in time.  I'm
>     > mainly exploring in case we become popular enough some day to need it.
>     >
>     > My line of thinking is that MS has, at least for now, generously
>     > provided free Azure VMs to ASF committers.  If N committers from a
>     > project each get a VM, run CI on it, figure out some way to distribute
>     > PRs to those VMs, is there a viable workflow?
>     >
>     > On 2/3/20, 6:38 PM, "David Nalley" <[email protected]> wrote:
>     >
>     >     Hi Alex,
>     >
>     >     So this was explored. It creates some problems - first double the
>     >     administration overhead - most of that is automated, but it means
>     >     that our API usage doubles, and we're already hitting limits from
>     >     Github.
>     >
>     > Is that a max-traffic limit or a limit on traffic before we have to
>     > start paying for usage?
>
>     Max number of calls - and we've tried offering up money, they don't
>     offer a product with more API calls. Greg has even raised this issue
>     all the way to the CEO of Github.
>
>     >
>     >     Second - at least one CI vendor thanked us for not doing that
>     >     exactly - because the 'best' way to do it is to create an org per
>     >     project or org per repo - and then the free tier is dedicated to
>     >     that org. Except that's essentially abusing their free tier.
>     >
>     > Is "best" defined as lowest cost to the CI vendor or something else?
>     > What would the "second-best" scenario look like if there is one?
>
>     Best - well it's the cheapest for us, and it gives the most control to
>     the projects. So great from that perspective, but likely a bit
>     unethical and abusive. It's essentially abusing all of the CI vendors'
>     generosity by horizontally scaling our consumption of their freebies
>     and using them per-repo or per-project instead of per-organization.
>
>
>     >
>     >     Finally - from a practical perspective, if everyone submits PRs and
>     >     does testing against this apacheci org - that has become the de
>     >     facto repo - it's where everyone is doing their work, and it makes
>     >     provenance tracking.
>     >
>     > Didn't the ASF have read-only mirrors of repos?  I think it led to
>     > some confusion, but I think folks still figured out.
>     >
>
>     Not anymore.
>     We have an active-active copy of the repositories. People can actively
>     commit against either our repos or the GH repos, and we magically move
>     commits between the two. (There's an upcoming blog post on how all of
>     this magic works)
>
>     >     As an aside - the mandate for no write access is not an
>     >     infrastructure policy, it's a legal affairs requirement - we're
>     >     merely implementing it.
>     >
>     >     --David
>     >
>     >     On Tue, Feb 4, 2020 at 3:24 AM Alex Harui <[email protected]> wrote:
>     >     >
>     >     > Moving board@ to BCC.  Attempting to move discussion to builds@
>     >     >
>     >     > I’m fine with the ASF maintaining its position on stricter
>     >     > provenance and therefore disallowing third-party write-access to
>     >     > repos.
>     >     >
>     >     > A suggestion was made, if I understood it correctly, to create a
>     >     > whole other set of repos that could be written to by third-parties.
>     >     > Would such a thing work?  Then a committer would have to manually
>     >     > bring commits back from that other set to the canonical repo.  That
>     >     > seems viable to me.
>     >     >
>     >     > A concern was raised that the project might cut its release from
>     >     > the “other set”, but IMO, that would be ok if the release artifacts
>     >     > could be verified, which should be possible by comparing the
>     >     > canonical repo against the “other repo”, at least for the source
>     >     > package, and if there are reproducible binaries, for the binary
>     >     > artifacts as well.
>     >     >
>     >     > Thoughts?
>     >     > -Alex
>     >     >
>     >     > From: Greg Stein <[email protected]>
>     >     > Reply-To: "[email protected]" <[email protected]>
>     >     > Date: Monday, February 3, 2020 at 5:17 PM
>     >     > To: "[email protected]" <[email protected]>
>     >     > Subject: Re: [CI] What are the troubles projects face with CI and Infra
>     >     >
>     >     > On Mon, Feb 3, 2020 at 6:48 PM Alex Harui <[email protected]> wrote:
>     >     > >...
>     >     > How does Google or other non-ASF open source projects manage the
>     >     > provenance tracking?
>     >     >
>     >     > Note that most F/OSS projects don't worry about provenance to the
>     >     > level the Foundation worries. That affords them some flexibility
>     >     > that our choices do not allow. Those projects may also choose to
>     >     > trust tools with write access to their repositories, hoping they
>     >     > will not Do Something Bad(tm). We have chosen to not provide that
>     >     > trust.
>     >     >
>     >     > IMO, I do not think the Foundation should relax its stance on
>     >     > provenance, nor trust in third parties ... but that is one of the
>     >     > key considerations [for the Board] at the heart of being able to
>     >     > leverage some third party CI/CD services.
>     >     >
>     >     > Cheers,
>     >     > -g
>     >     >
>     >
>     >
>
>
