Let me know if I'm talking nonsense, but I believe that we are
building both stages for each architecture and flavour. Do we need to
build two stages everywhere? What stops us from building a single
stage? And if anything, what can we change to get into a situation
where we can?
Better than reusing builds incrementally is not building at all.
On Mon, Feb 22, 2021 at 10:09 AM Simon Peyton Jones via ghc-devs
<ghc-devs@haskell.org> wrote:
Incremental CI can cut multiple hours to < mere minutes,
especially with the test suite being embarrassingly parallel.
There is simply no way optimizations to the compiler independent from
sharing a cache between CI runs can get anywhere close to that
return on investment.
I rather agree with this. I don’t think there is much low-hanging
fruit on compile times, aside from coercion-zapping which we are
working on anyway. If we got a 10% reduction in compile time we’d
be over the moon, but our users would barely notice.
To get truly substantial improvements (a factor of 2 or 10) I
think we need to do less compiling – hence incremental CI.
Simon
*From:* ghc-devs <ghc-devs-boun...@haskell.org> *On Behalf Of* John Ericson
*Sent:* 22 February 2021 05:53
*To:* ghc-devs <ghc-devs@haskell.org>
*Subject:* Re: On CI
I'm not opposed to some effort going into this, but I would
strongly oppose putting all our effort there. Incremental CI can
cut multiple hours to < mere minutes, especially with the test
suite being embarrassingly parallel. There is simply no way
optimizations to the compiler independent from sharing a cache
between CI runs can get anywhere close to that return on investment.
(FWIW, I'm also skeptical that the people complaining about GHC
performance know what's hurting them most. For example, after
non-incrementality, the next slowest thing is linking, which
is...not done by GHC! But all that is a separate conversation.)
John
On 2/19/21 2:42 PM, Richard Eisenberg wrote:
There are some good ideas here, but I want to throw out
another one: put all our effort into reducing compile times.
There is a loud plea to do this on Discourse
<https://discourse.haskell.org/t/call-for-ideas-forming-a-technical-agenda/1901/24>,
and it would both solve these CI problems and also help
everyone else.
This isn't to say to stop exploring the ideas here. But since
time is mostly fixed, tackling compilation times in general
may be the best way out of this. Ben's survey of other
projects (thanks!) shows that we're way, way behind in how
long our CI takes to run.
Richard
On Feb 19, 2021, at 7:20 AM, Sebastian Graf
<sgraf1...@gmail.com> wrote:
Recompilation avoidance
I think in order to cache more in CI, we first have to
invest some time in fixing recompilation avoidance in our
bootstrapped build system.
I just tested on a hadrian perf ticky build: Adding one
line of *comment* in the compiler causes
* a (pretty slow, yet negligible) rebuild of the stage1
compiler
* 2 minutes of RTS rebuilding (Why do we have to rebuild
the RTS? It doesn't depend in any way on the change I
made)
* an apparent full rebuild of the libraries
* an apparent full rebuild of the stage2 compiler
That took 17 minutes, whereas a full build takes ~45 minutes. So
there definitely is some caching going on, but not nearly
as much as there could be.
I know there have been great and boring efforts on
compiler determinism in the past, but either it's not good
enough or our build system needs fixing.
I think a good first step to assert would be to make sure
that the hash of the stage1 compiler executable doesn't
change if I only change a comment.
I'm aware there probably is stuff going on, like embedding
configure dates in interface files and executables, that
would need to go, but if possible this would be a huge
improvement.
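A minimal sketch of that first check, assuming the two stage1 executables (before and after the comment-only change) are available as files on disk. The function names are made up for illustration and are not part of Hadrian or GHC.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()


def same_binary(before: Path, after: Path) -> bool:
    """True if the two build products are byte-for-byte identical."""
    return file_digest(before) == file_digest(after)
```

If `same_binary` returns False for a comment-only change, something non-deterministic (an embedded date, for instance) is presumably leaking into the executable.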
On the other hand, we can simply tack on a [skip ci] to
the commit message, as I did for
https://gitlab.haskell.org/ghc/ghc/-/merge_requests/4975.
Variants like [skip tests] or [frontend] could help to
identify which tests to run by default.
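As a rough illustration of how such tags could drive job selection, here is a sketch that parses bracketed tags out of a commit message. The job names and the tag-to-job mapping are assumptions; only [skip ci] and [skip tests] are mapped here, and unknown tags such as [frontend] are simply ignored.

```python
import re

# Hypothetical job names and tag mapping, for illustration only.
ALL_JOBS = {"build", "tests"}
TAG_SKIPS = {
    "skip ci": {"build", "tests"},
    "skip tests": {"tests"},
}


def jobs_to_run(commit_message: str) -> set[str]:
    """Return the jobs left after applying every recognised [tag]."""
    jobs = set(ALL_JOBS)
    for tag in re.findall(r"\[([^\]]+)\]", commit_message):
        # Unrecognised tags contribute nothing and leave the job set as is.
        jobs -= TAG_SKIPS.get(tag.strip().lower(), set())
    return jobs
```

A tag like [frontend] would need a finer-grained job list (per test directory, say) before it could select anything.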
Lean
I had a chat with a colleague about how they do CI for
Lean. Apparently, CI turnaround time including tests is
generally 25 minutes (~15 minutes for the build) for a
complete pipeline, testing 6 different OSes and
configurations in parallel:
https://github.com/leanprover/lean4/actions/workflows/ci.yml
They utilise ccache to cache the clang-based C++-backend,
so that they only have to re-run the front- and
middle-end. In effect, they take advantage of the fact
that the "function" clang, in contrast to the "function"
stage1 compiler, stays the same.
It's hard to achieve that for GHC, where a complete
compiler pipeline comes as one big, fused "function": An
external tool can never be certain that a change to
Parser.y could not affect the CodeGen phase.
Inspired by Lean, the following is a bit vague and
imaginary, but maybe we could make it so that compiler
phases "sign" parts of the interface file with the binary
hash of the respective subcomponents of the phase?
E.g., if all the object files that influence CodeGen (that
will later be linked into the stage1 compiler) result in a
hash of 0xdeadbeef before and after the change to
Parser.y, we know we can stop recompiling Data.List with
the stage1 compiler when we see that the IR passed to
CodeGen didn't change, because the last compile did
CodeGen with a stage1 compiler with the same hash
0xdeadbeef. The 0xdeadbeef hash is a proxy for saying "the
function CodeGen stayed the same", so we can reuse its
cached outputs.
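To make the idea concrete, here is a toy sketch of a cache keyed by the pair (hash of the phase's implementation, hash of the phase's input). Nothing here reflects how GHC interface files actually work; `PhaseCache` and `digest` are hypothetical names, and the phase is modelled as a plain function on bytes.

```python
import hashlib


def digest(*parts: bytes) -> str:
    """Hash a sequence of byte strings, length-prefixed to avoid ambiguity."""
    h = hashlib.sha256()
    for p in parts:
        h.update(len(p).to_bytes(8, "big"))
        h.update(p)
    return h.hexdigest()


class PhaseCache:
    """Cache phase outputs keyed by (phase implementation hash, input hash)."""

    def __init__(self):
        self._store = {}
        self.hits = 0

    def run(self, phase_hash: str, phase_input: bytes, phase_fn):
        key = digest(phase_hash.encode(), phase_input)
        if key in self._store:
            # Same "function" and same input: reuse the cached output.
            self.hits += 1
            return self._store[key]
        out = phase_fn(phase_input)
        self._store[key] = out
        return out
```

With `phase_hash` fixed at the 0xdeadbeef of the example, a second call on the same IR is a cache hit; changing the hash forces the phase to run again.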
Of course, that is utopian without a tool that does the
"taint analysis" of which modules in GHC influence
CodeGen. Probably just including all the transitive
dependencies of GHC.CmmToAsm suffices, but probably that's
too crude already. For another example, a change to
GHC.Utils.Unique would probably entail a full rebuild of
the compiler because it basically affects all compiler phases.
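The crude version of that analysis is just reachability in the module import graph. A sketch, assuming the graph is available as a mapping from module name to its direct imports (how one would extract that from GHC's sources is left open):

```python
def transitive_deps(graph: dict[str, set[str]], root: str) -> set[str]:
    """All modules reachable from root via import edges, including root."""
    seen = set()
    stack = [root]
    while stack:
        mod = stack.pop()
        if mod in seen:
            continue
        seen.add(mod)
        stack.extend(graph.get(mod, ()))
    return seen


def phase_affected(graph: dict[str, set[str]], phase_root: str,
                   changed: set[str]) -> bool:
    """True if any changed module is in the phase root's dependency closure."""
    return bool(transitive_deps(graph, phase_root) & changed)
```

On a graph where GHC.CmmToAsm (transitively) imports GHC.Utils.Unique but not the parser, a change to the parser leaves CodeGen "untainted", while a change to GHC.Utils.Unique taints it, which matches the expectation above.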
There are probably parallels with recompilation avoidance
in a language with staged meta-programming.
On Fri, Feb 19, 2021 at 11:42 AM, Josef Svenningsson via ghc-devs
<ghc-devs@haskell.org> wrote:
Doing "optimistic caching" like you suggest sounds
very promising. A way to regain more robustness would
be as follows.
If the build fails while building the libraries or the
stage2 compiler, this might be a false negative due to
the optimistic caching. Therefore, evict the
"optimistic caches" and restart building the
libraries. That way we can validate that the build
failure was a true build failure and not just due to
the aggressive caching scheme.
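The evict-and-retry step could look roughly like this. It is only a sketch: `build` stands for a hypothetical callable that builds the libraries and stage2 against a given cache, and `BuildFailure` is an invented exception type.

```python
class BuildFailure(Exception):
    """Raised by the (hypothetical) build step when compilation fails."""


def build_with_fallback(build, cache: dict):
    """Try an optimistic cached build; on failure, evict and rebuild once.

    Returns (result, used_optimistic_cache). If the rebuild without the
    cache also fails, the BuildFailure propagates as a true failure.
    """
    try:
        return build(cache), True
    except BuildFailure:
        cache.clear()               # evict the optimistic caches
        return build(cache), False  # validate with a clean rebuild
```

The cost of a false negative is then one wasted optimistic attempt plus a full rebuild, which is no worse than today's behaviour plus the failed attempt.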
Just my 2p
Josef
------------------------------------------------------------------------
*From:* ghc-devs <ghc-devs-boun...@haskell.org> on behalf of
Simon Peyton Jones via ghc-devs <ghc-devs@haskell.org>
*Sent:* Friday, February 19, 2021 8:57 AM
*To:* John Ericson <john.ericson@obsidian.systems>; ghc-devs
<ghc-devs@haskell.org>
*Subject:* RE: On CI
1. Building and testing happen together. When tests
fail spuriously, we also have to rebuild GHC in
addition to re-running the tests. That's pure
waste.
https://gitlab.haskell.org/ghc/ghc/-/issues/13897
tracks this more or less.
I don’t get this. We have to build GHC before we can
test it, don’t we?
2. We don't cache between jobs.
This is, I think, the big one. We endlessly build
the exact same binaries.
There is a problem, though. If we make **any** change
in GHC, even a trivial refactoring, its binary will
change slightly. So now any caching build system will
assume that anything built by that GHC must be rebuilt
– we can’t use the cached version. That includes all
the libraries and the stage2 compiler. So caching can
save all the preliminaries (building the initial
Cabal, and a large chunk of stage1, since they are built
with the same bootstrap compiler) but after that we
are dead.
I don’t know any robust way out of this. That small
change in the source code of GHC might be trivial
refactoring, or it might introduce a critical
mis-compilation which we really want to see in its
build products.
However, for smoke-testing MRs, on every architecture,
we could perhaps cut corners. (Leaving Marge to do
full diligence.) For example, we could declare that
if we have the result of compiling library module X.hs
with the stage1 GHC in the last full commit in master,
then we can re-use that build product rather than
compiling X.hs with the MR’s slightly modified stage1
GHC. That **might** be wrong; but it’s usually right.
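The corner-cutting amounts to dropping the compiler's hash from the cache key for smoke tests only. A sketch under that assumption, with invented function names and a plain dict standing in for the shared cache:

```python
import hashlib


def strict_key(compiler_hash: str, source: bytes) -> str:
    """Key that changes whenever the compiler binary changes."""
    return hashlib.sha256(compiler_hash.encode() + b"\0" + source).hexdigest()


def optimistic_key(source: bytes) -> str:
    """Key that ignores the compiler, so it may reuse a stale product."""
    return hashlib.sha256(source).hexdigest()


def lookup(cache: dict, compiler_hash: str, source: bytes, smoke_test: bool):
    """Return a cached build product or None.

    Full validation only accepts products built by the same compiler;
    smoke tests fall back to whatever master last produced for this source.
    """
    hit = cache.get(strict_key(compiler_hash, source))
    if hit is None and smoke_test:
        hit = cache.get(optimistic_key(source))
    return hit
```

Here master's pipeline would store each product under both keys, and only the full (Marge) run would insist on the strict one.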
Anyway, there are big wins to be had here.
Simon
*From:* ghc-devs <ghc-devs-boun...@haskell.org> *On Behalf Of*
John Ericson
*Sent:* 19 February 2021 03:19
*To:* ghc-devs <ghc-devs@haskell.org>
*Subject:* Re: On CI
I am also wary of us deferring the checking of whole
platforms and whatnot. I think that's just kicking
the can down the road, and will result in more
variance and uncertainty. It might be alright for
those authoring PRs, but it will make Ben's job
keeping the system running even more grueling.
Before getting into these complex trade-offs, I think
we should focus on the cornerstone issue that CI isn't
incremental.
1. Building and testing happen together. When tests
fail spuriously, we also have to rebuild GHC in
addition to re-running the tests. That's pure
waste.
https://gitlab.haskell.org/ghc/ghc/-/issues/13897
tracks this more or less.
2. We don't cache between jobs. Shake and Make do not
enforce dependency soundness, nor
cache-correctness when the build plan itself
changes, and this has made it hard/impossible to
do safely. Naively this only helps with stage 1
and not stage 2, but if we have separate stage 1
and --freeze1 stage 2 builds, both can be
incremental. Yes, this is also lossy, but I only
see it leading to false failures not false
acceptances (if we can also test the stage 1 one),
so I consider it safe. MRs that only work with a
slow full build because of ABI changes can indicate so.
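For item 1, separating the two steps could be as simple as keeping the build product around and retrying only the tests. A sketch, where `build`, `run_tests` and the `artifacts` store are all hypothetical stand-ins for CI jobs and artifact storage:

```python
def run_pipeline(commit: str, build, run_tests, artifacts: dict,
                 max_test_retries: int = 2) -> bool:
    """Build once per commit, then retry only the tests on failure."""
    if commit not in artifacts:
        # The expensive step happens at most once per commit.
        artifacts[commit] = build(commit)
    binary = artifacts[commit]
    for _ in range(1 + max_test_retries):
        if run_tests(binary):
            return True
    return False
```

A spurious test failure then costs one more test run rather than another full build of GHC.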
The second, main part is quite hard to tackle, but I
strongly believe incrementality is what we need most,
and what we should remain focused on.
John
_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs