On Fri, May 11, 2018 at 4:06 PM, Gregory Szorc <g...@mozilla.com> wrote:
> On Wed, May 9, 2018 at 11:01 AM, Ted Mielczarek <t...@mielczarek.org>
> wrote:
>
> > On Wed, May 9, 2018, at 1:11 PM, L. David Baron wrote:
> > > > mozregression won't be able to bisect into inbound branches then,
> > > > but I believe we've always been expiring build artifacts created
> > > > from integration branches after a few months in any case.
> > > >
> > > > My impression was that people use mozregression primarily for
> > > > tracking down relatively recent regressions. Please correct me if
> > > > I'm wrong.
> > >
> > > It's useful for tracking down regressions no matter how old the
> > > regression is; I pretty regularly see mozregression finding useful
> > > data on bugs that regressed multiple years ago.
> >
> > To be clear here--we still have an archive of nightly builds dating
> > back to 2004, so you should be able to bisect to a single day using
> > that. We haven't ever had a great policy for retaining individual CI
> > builds like these tinderbox-builds. They're definitely useful, and
> > storage is not that expensive, but given the number of build
> > configurations we produce nowadays and the volume of changes being
> > pushed we can't archive everything forever.
>
> It's worth noting that once builds are deterministic, a build system is
> effectively a highly advanced caching mechanism. It follows that cache
> eviction is therefore a tolerable problem: if the entry isn't in the
> cache, you just build again! Artifact retention and expiration boils down
> to a trade-off between the cost of storage and the convenience of
> accessing something immediately (as opposed to waiting several dozen
> minutes to populate the cache).
>
> The good news is that Linux Firefox builds have been effectively
> deterministic (modulo PGO and some minor build details like the build
> time) for several months now (thanks, glandium!). And moving to Clang on
> all platforms will make it easier to achieve deterministic builds on
> other platforms.
> The bad news is we still have many areas of CI that are not hermetic and
> attempts to retrigger Firefox build tasks in the future have a very high
> possibility of failing for numerous reasons (e.g. some dependent task of
> the build hits a 3rd party server that is no longer available or has
> deleted a file). In other words, our CI results may not be reproducible
> in the future. So if we delete an artifact, even though the build is
> deterministic, we may not have all the inputs to reconstruct that result.
>
> Making CI hermetic and reproducible far in the future is a hard problem.
> There are esoteric failure scenarios like "what if we need to fetch
> content from a server in 2030 but TLS 1.2 has been disabled due to a
> critical vulnerability and code in the hermetic build task doesn't
> support TLS 1.3." In order to realistically achieve reproducible builds
> in the future, we need to store *all* inputs somewhere reliable where
> they will always be available. Version control is one possibility. A
> content-indexed service like tooltool is another. (At Google, they check
> in the source code for Clang, glibc, binutils, Linux, etc into version
> control so all they need is a version revision and a bootstrap compiler
> (which I also suspect they check into the monorepo) to rebuild the world
> from source.)
>
> What I'm trying to say is we're making strides towards making builds
> deterministic and reproducible far in the future. So hopefully in a few
> years we won't need to be concerned about deleting old data because our
> answer will be "we can easily reproduce it at any time."

This might end up being true, but it seems a bit optimistic to me. I've
worked with lots of systems much simpler than our builds that were in
theory reproducible, but when I went back to reproduce the results, I
found things weren't so simple. You allude to one case above: it's one
thing to have reproducible builds from days ago and quite another from
years ago.
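
The trade-off discussed above can be made concrete with a small sketch. It
treats the artifact store as a cache in front of a deterministic build, and
shows why eviction is only safe while every input is still retrievable. All
names here (InputStore, ArtifactCache, build) are hypothetical and invented
for illustration; this is not tooltool's API or how Firefox CI is actually
implemented, and the "build" is just a hash standing in for a real compile.

```python
from __future__ import annotations

import hashlib


class InputStore:
    """Hypothetical content-indexed store: blobs keyed by SHA-256 digest."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def get(self, digest: str) -> bytes:
        # A missing input is the failure mode: the build is deterministic,
        # but it cannot be re-run because an input has disappeared.
        if digest not in self._blobs:
            raise LookupError(f"input {digest[:12]} is no longer available")
        return self._blobs[digest]

    def delete(self, digest: str) -> None:
        self._blobs.pop(digest, None)


def build(inputs: list[bytes]) -> bytes:
    """Stand-in for a deterministic build: same inputs, same output."""
    return hashlib.sha256(b"".join(inputs)).digest()


class ArtifactCache:
    """Artifacts keyed by the digests of the inputs that produced them."""

    def __init__(self, store: InputStore) -> None:
        self.store = store
        self._artifacts: dict[tuple[str, ...], bytes] = {}

    def get(self, input_digests: tuple[str, ...]) -> bytes:
        if input_digests in self._artifacts:
            return self._artifacts[input_digests]  # cache hit
        # Cache miss: fetch every input and rebuild. This raises
        # LookupError if any input has been lost.
        inputs = [self.store.get(d) for d in input_digests]
        artifact = build(inputs)
        self._artifacts[input_digests] = artifact
        return artifact

    def evict(self, input_digests: tuple[str, ...]) -> None:
        self._artifacts.pop(input_digests, None)


if __name__ == "__main__":
    store = InputStore()
    key = (store.put(b"source tree"), store.put(b"toolchain"))
    cache = ArtifactCache(store)

    original = cache.get(key)
    cache.evict(key)
    print("rebuilt identically:", cache.get(key) == original)

    cache.evict(key)
    store.delete(key[1])  # simulate an input vanishing years later
    try:
        cache.get(key)
    except LookupError as err:
        print("cannot reproduce:", err)
```

In this toy model, evicting an artifact costs nothing as long as the input
store is complete. Once a single input is gone, the evicted artifact is
unrecoverable, which is the case where keeping the artifact would have been
the cheaper choice.
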

Given the incredibly low cost of storage (the street price of Glacier is
$0.004/GB/month) [0], I'd be pretty hesitant to delete data which we
thought we might want to use again just because we figured we'd reproduce
it.

-Ekr

[0] https://aws.amazon.com/glacier/

_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform