Hi Helmut, Happy new year!

On Sat, Dec 16, 2023 at 07:34PM, Helmut Grohne wrote:
> I fear this is misleadingly imprecise. If I understand what you write
> later correctly, we always build GHC for the same architecture (i.e. we
> ask configure for build=host), but we may opt for a cross compiler (i.e.
> target!=host). When the ghc Debian package is asked to cross compile,
> what we ask configure instead is natively building a cross compiler. Do
> you confirm?

Yes, I confirm. Up until now, this was the way to get both a cross
compiler (the STAGE1_TOOL compiler) and a cross-compiled GHC (the
STAGE2_TOOL compiler) [1], and we installed the STAGE2_TOOL compiler in
the Debian package.

[1] https://gitlab.haskell.org/ghc/ghc/-/wikis/building/cross-compiling#terminology-and-background

This changed with the new build system, and now we get only a cross
compiler from both stage1 and stage2. One can of course argue that this
is the expected output (given we passed build=host) :)

> This is still quite confusing to me. I suppose stage0 is /usr/bin/ghc.
> Do you confirm? Then there should be a _build/stage0 containing the
> stage1 compiler. That stage1 compiler probably is a simple native build
> of the ghc sources at hand, which means that its binary may have a
> different ABI from what the sources expect. That stage1 will be unable
> to use packages, but it can be used to build ghc again. Is that also
> correct? Then that stage1 is used to build a stage2 where the ABI of the
> binary matches the behaviour and this can be used with packages.
> Do I understand correctly that (currently) stage1 is a simple native
> build of the current ghc sources where the binary uses the ABI that
> /usr/bin/ghc generated and doesn't necessarily match the sources? Do I
> also understand that stage2 (stored in _build/stage1) the is a cross
> compiler generating code for $DEB_HOST_ARCH and runnable on
> $DEB_BUILD_ARCH using the ABI given by the current ghc sources?

Yes to all of the above, this is my understanding as well.
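
To make the build/host/target terminology above concrete, here is a
minimal Python sketch. It is not part of the ghc packaging: the names
`Triples` and `classify` and the example triples are hypothetical, and
it only models the autoconf convention (build = where the compiler is
built, host = where it runs, target = what it generates code for).

```python
from typing import NamedTuple


class Triples(NamedTuple):
    """Autoconf-style triples; this class is an illustrative assumption."""
    build: str   # machine the compiler is built on
    host: str    # machine the resulting compiler runs on
    target: str  # machine the resulting compiler generates code for


def classify(t: Triples) -> str:
    """Describe what kind of compiler a triple combination yields."""
    if t.build == t.host:
        if t.host == t.target:
            return "native compiler"
        # build=host, target!=host: the case discussed above
        return "natively built cross compiler"
    if t.host == t.target:
        return "cross-compiled native compiler"
    return "cross-compiled cross compiler"


# Hypothetical amd64 -> arm64 example triples.
amd64 = "x86_64-linux-gnu"
arm64 = "aarch64-linux-gnu"

print(classify(Triples(build=amd64, host=amd64, target=arm64)))
print(classify(Triples(build=amd64, host=arm64, target=arm64)))
```

The first call corresponds to what the package currently asks configure
for when cross compiling; the second is the cross-compiled GHC we would
ultimately want to ship in the .deb.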

> At no point do we (currently) actually cross build using that stage2. Do
> you also confirm?

No, we don't. I have tried doing that by:

1. Trying to build stage 3, but the resulting compiler was still a
   cross compiler, see [2].
2. Trying to use the resulting stage 2 compiler as a bootstrap compiler
   (stage 0) for a new build, see [3].

Both of the above approaches failed. This is where we are currently
blocked, and I admit I haven't tried to move past this.

[2] https://gitlab.haskell.org/ghc/ghc/-/issues/23975#note_526549
[3] https://gitlab.haskell.org/ghc/ghc/-/issues/23975#note_530085

> As far as I can see, the name "STAGE1_TOOL" is misleading and it should
> be called "STAGE2_TOOL" instead. Do you concur? Then a STAGE2_TOOL
> always is something that runs on the build machine and operates on
> $DEB_HOST_ARCH, which seems just about right, no? STAGE2_TOOL not
> necessarily is something we'd want to install into a .deb though. Do you
> agree?

Yes, I agree.

> > Calling 'ghc-pkg recache' here is wrong. I suppose we can skip this step
> > if we are cross-compiling (so we can reach the next failure).
>
> Can you elaborate on why we do not want to reset the package cache here?

What we want to do here is run ghc-pkg and query the GHC database for
the library ABIs. But this needs to be done for the HOST/TARGET
architecture. I am not really sure whether running the STAGE2_TOOL
ghc-pkg here (which runs on the BUILD architecture) will produce the
same ABIs as the STAGE3_TOOL ghc-pkg (if we had one).

> Most of the time, this is true. However, we can also cross build from
> amd64 to i386 or arm64 to armhf (which gets us 64bit address space
> during build) and we can build with a qemu-user-static installed. In
> both cases, we can actually run tests. Therefore the decision whether to
> run tests is left to the builder. Both sbuild and pbuilder default to
> not running tests by automatically adding nocheck to DEB_BUILD_OPTIONS
> when you ask for a cross build. This whole block is conditional to
> DEB_BUILD_OPTIONS not containing nocheck, so the only way this is
> relevant is when a builder overrides this default and thus explicitly
> requests running tests despite performing a cross build. And in that
> case, the proposed patch should make sense, no? Your proposed skipping
> already is implemented via nocheck.

You are right, sbuild will pass DEB_BUILD_OPTIONS=nocheck by default,
so there is no need to explicitly disable our tests. And if one
explicitly asks for them, we can definitely find a way to support this.

> Thank you for considering "my way". You have made a good case for
> understanding the end-to-end process and you definitely convinced me
> that this is necessary. Often times, the incremental process just works
> and here it likely is not ideal, but the discussion still seems to
> advance us and I would be more than happy to continue and improve our
> understanding to reach that state where we make that end-to-end process
> work practically. Hope you can bear with me.
>
> Please take your time to respond even if that happens to be next year.
> This is not something we have to fix right now. I prefer a good and
> maintainable solution over a quick solution.

Thank you for helping here. This discussion really helps move things
forward. I have now uploaded GHC 9.6.4-1~exp1 to experimental, where I
disabled the explicit error and let the build fail at a later step. I
didn't fix the tests for now; let me know if this causes problems with
our QA system.

-- Ilias