Hi Helmut,

Happy new year!

On Sat, Dec 16, 2023 at 07:34PM, Helmut Grohne wrote:
> I fear this is misleadingly imprecise. If I understand what you write
> later correctly, we always build GHC for the same architecture (i.e. we
> ask configure for build=host), but we may opt for a cross compiler (i.e.
> target!=host). When the ghc Debian package is asked to cross compile,
> what we ask configure instead is natively building a cross compiler. Do
> you confirm?

Yes, I confirm. Up until now, this was the way to get both a cross
compiler (the STAGE1_TOOL compiler) and a cross-compiled GHC (the
STAGE2_TOOL compiler) [1], and we installed in the Debian package the
STAGE2_TOOL compiler.

[1] https://gitlab.haskell.org/ghc/ghc/-/wikis/building/cross-compiling#terminology-and-background

This changed with the new build-system, and now we get only a cross
compiler both from stage1 and stage2. One can of course argue that this
is the expected output (given we passed build=host) :)
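
To make the terminology concrete, here is a rough sketch in Python of
how the build/host/target triple maps to the kind of compiler configure
is asked to produce. The function name and the labels it returns are
made up for illustration; nothing here comes from GHC's build system or
from our packaging.

```python
def classify(build: str, host: str, target: str) -> str:
    """Describe the compiler implied by a GNU-style build/host/target triple.

    build  = architecture the compiler is built on
    host   = architecture the compiler runs on
    target = architecture the compiler generates code for
    """
    if build == host and host == target:
        return "native compiler"
    if build == host:
        # Runs where it was built, but emits code for another architecture.
        return "natively built cross compiler"
    if host == target:
        # Built on one architecture, runs natively on another.
        return "cross-compiled native compiler"
    # build != host and host != target
    return "cross-compiled cross compiler"


# What the ghc package currently asks for in a cross build (example arches):
print(classify("amd64", "amd64", "arm64"))  # natively built cross compiler
```

What we would want to ship in the .deb corresponds to the third case
(build != host, host == target).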

> This is still quite confusing to me. I suppose stage0 is /usr/bin/ghc.
> Do you confirm? Then there should be a _build/stage0 containing the
> stage1 compiler. That stage1 compiler probably is a simple native build
> of the ghc sources at hand, which means that its binary may have a
> different ABI from what the sources expect. That stage1 will be unable
> to use packages, but it can be used to build ghc again. Is that also
> correct? Then that stage1 is used to build a stage2 where the ABI of the
> binary matches the behaviour and this can be used with packages.

> Do I understand correctly that (currently) stage1 is a simple native
> build of the current ghc sources where the binary uses the ABI that
> /usr/bin/ghc generated and doesn't necessarily match the sources? Do I
> also understand that stage2 (stored in _build/stage1) then is a cross
> compiler generating code for $DEB_HOST_ARCH and runnable on
> $DEB_BUILD_ARCH using the ABI given by the current ghc sources?

Yes on all of the above; this is my understanding as well.

> At no point do we (currently) actually cross build using that stage2. Do
> you also confirm?

Correct, we don't. I have tried doing that by:

1. Trying to build stage 3, but the resulting compiler was still a cross
compiler, see [2].

2. Trying to use the resulting stage 2 compiler as a bootstrap compiler
(stage 0) for a new build, see [3].

Both of the above approaches failed. This is where we are currently
blocked, and I admit I haven't tried to move past this.

[2] https://gitlab.haskell.org/ghc/ghc/-/issues/23975#note_526549
[3] https://gitlab.haskell.org/ghc/ghc/-/issues/23975#note_530085

> As far as I can see, the name "STAGE1_TOOL" is misleading and it should
> be called "STAGE2_TOOL" instead. Do you concur? Then a STAGE2_TOOL
> always is something that runs on the build machine and operates on
> $DEB_HOST_ARCH, which seems just about right, no? STAGE2_TOOL not
> necessarily is something we'd want to install into a .deb though. Do you
> agree?

Yes I agree.

> > Calling 'ghc-pkg recache' here is wrong. I suppose we can skip this step
> > if we are cross-compiling (so we can reach the next failure).
> 
> Can you elaborate on why we do not want to reset the package cache here?

What we want to do here is to be able to run ghc-pkg and query the GHC
database for the library ABIs. But this needs to be done for the
HOST/TARGET architecture. I am not really sure if running the
STAGE2_TOOL ghc-pkg here (which runs on the BUILD architecture) will
produce the same ABIs as the STAGE3_TOOL ghc-pkg (if we had one).

> Most of the time, this is true. However, we can also cross build from
> amd64 to i386 or arm64 to armhf (which gets us 64bit address space
> during build) and we can build with a qemu-user-static installed. In
> both cases, we can actually run tests. Therefore the decision whether to
> run tests is left to the builder. Both sbuild and pbuilder default to
> not running tests by automatically adding nocheck to DEB_BUILD_OPTIONS
> when you ask for a cross build. This whole block is conditional to
> DEB_BUILD_OPTIONS not containing nocheck, so the only way this is
> relevant is when a builder overrides this default and thus explicitly
> requests running tests despite performing a cross build. And in that
> case, the proposed patch should make sense, no? Your proposed skipping
> already is implemented via nocheck.

You are right: for cross builds sbuild adds nocheck to
DEB_BUILD_OPTIONS by default, so there is no need to explicitly disable
our tests. And if someone explicitly asks for them, we can definitely
find a way to support this.
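
For reference, the nocheck decision boils down to something like the
following. This is only a sketch in Python (debian/rules would express
it in make), and the function name is hypothetical. It assumes
DEB_BUILD_OPTIONS is a whitespace-separated list of flags.

```python
import os


def tests_requested(environ=os.environ) -> bool:
    """Return True unless 'nocheck' appears in DEB_BUILD_OPTIONS."""
    options = environ.get("DEB_BUILD_OPTIONS", "").split()
    return "nocheck" not in options


# Default cross build under sbuild: nocheck is set, so tests are skipped.
print(tests_requested({"DEB_BUILD_OPTIONS": "nocheck parallel=4"}))  # False
# Builder overrides the default and asks for tests explicitly.
print(tests_requested({"DEB_BUILD_OPTIONS": "parallel=4"}))  # True
```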

> Thank you for considering "my way". You have made a good case for
> understanding the end-to-end process and you definitely convinced me
> that this is necessary. Often times, the incremental process just works
> and here it likely is not ideal, but the discussion still seems to
> advance us and I would be more than happy to continue and improve our
> understanding to reach that state where we make that end-to-end process
> work practically. Hope you can bear with me.
> 
> Please take your time to respond even if that happens to be next year.
> This is not something we have to fix right now. I prefer a good and
> maintainable solution over a quick solution.

Thank you for helping here. This discussion really helps move things
forward.

I have now uploaded GHC 9.6.4-1~exp1 to experimental, where I disabled
the explicit error and let the build fail at a later step. I haven't
fixed the tests for now; let me know if this causes problems with our
QA system.

-- 
Ilias
