Hi,

On Mon, Oct 01, 2012 at 04:32:51PM +0200, David Coppa wrote:
> > >> Loading package integer-gmp ... linking ... ghc: 
> > >> /usr/local/lib/libgmp.a: unknown symbol `__guard_local'
[...]
> > I applied the patches to both amd64 and i386, the two failing ports
> > hs-vector and hs-type-level built on i386, not on amd64.
[...]
> The diff below, replacing SymE_NeedsProto with SymI_NeedsProto,
> makes things work for me on amd64 (tested with hs-vector and
> hs-type-level)...

For me, both versions (using either SymI_NeedsProto or SymE_NeedsProto)
*appear* to work (on amd64).  I don't understand why the SymE_NeedsProto
version failed for Nigel on amd64.

Note: if you don't want to read all the details, just skip forward
to "What can we do?" ;-)

Anyway, that's not important, because both "fixes" are wrong: they
fix the linker error (unknown symbol `__guard_local') but cause
random crashes at runtime as soon as libgmp.a is used heavily enough.
The example below just calculates the Fibonacci number at index 400,
and the second of two identical runs crashes:

$ cd $(make show=WRKBUILD)
$ echo 'let f = 1 : 1 : zipWith (+) f (tail f) in f !! 400' | 
./inplace/bin/ghc-stage2 --interactive
GHCi, version 7.4.2: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Prelude>
284812298108489611757988937681460995615380088782304890986477195645969271404032323901
Prelude> Leaving GHCi.
$ echo 'let f = 1 : 1 : zipWith (+) f (tail f) in f !! 400' | 
./inplace/bin/ghc-stage2 --interactive
GHCi, version 7.4.2: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Prelude> Segmentation fault (core dumped)
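
Since the crash is random, a single successful run proves nothing.
A minimal sketch for repeating a test and counting how often it
fails; the helper name count_failures is made up for this mail and
is not part of ghc or the ports tree:

```shell
# Hypothetical helper: run a command N times and print how many
# runs exited with a non-zero status (a segfault counts as well).
# usage: count_failures N command [args...]
count_failures() {
    n=$1; shift
    fails=0
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" >/dev/null 2>&1 || fails=$((fails + 1))
        i=$((i + 1))
    done
    echo "$fails"
}

# Possible use from within WRKBUILD (assumes ghc-stage2 was built):
#   fib() { echo 'let f = 1 : 1 : zipWith (+) f (tail f) in f !! 400' |
#           ./inplace/bin/ghc-stage2 --interactive; }
#   count_failures 20 fib
```

Anything other than 0 means the "fix" is still broken.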


> *GIANT* WARNING:
> http://blogs.gnome.org/diegoe/files/2012/09/no-idea-what-im-doing-dog.jpg

Yeah, same here. I didn't yet manage to read and understand all of
rts/Linker.c (well, I tried, but it's no fun reading that code).

But it does contain some comments about all the Sym[IE]_(Needs|Has)Proto
macros, especially one about a nasty hack for amd64 which will
probably fail if the exposed symbol is *not* a function.

Anyway, what's the purpose of rts/Linker.c? Initially, it was used
for ghci only, to load *statically* linked *Haskell* libraries and
object files at runtime. Unfortunately, today it's also used for a
language extension called "Template Haskell", which requires the
*compiler* to execute Haskell code during compile time, requiring
the compiler itself to load additional (statically linked) Haskell
libraries.

Loading *Haskell* libraries with ghc's rts/Linker.c worked and still
works fine, because the code generated by ghc does not reference
__guard_local; AFAIK, it doesn't even contain any normal functions.
Instead, it does some calculations and then jumps to another chunk
of code. It also maintains its own stack (not comparable to the
stack we know from C).

To confirm for yourself that there's no reference to __guard_local, try

$ nm /usr/local/lib/ghc/integer-gmp-0.4.0.0/libHSinteger-gmp-0.4.0.0.a | grep -c __guard_local

However, when using the Haskell library integer-gmp (or any other
Haskell library whose package info lists something in the field
"extra-libraries"), that extra library *will* contain references to
__guard_local. This works when linking normal executables (because
ghc uses ld or cc for this), but it doesn't work when rts/Linker.c
is involved (which in this case has to load both libHSinteger-gmp-0.4.0.0.a
and libgmp.a).
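
To compare the two archives, it may help to count only the
*undefined* references in the nm output. A rough sketch, assuming
the usual nm layout where an undefined symbol is printed as "U name";
the helper name undefined_guard_refs is made up for this mail:

```shell
# Hypothetical helper: read nm output on stdin and print the number
# of undefined ("U") references to __guard_local.
# Archive member headers and blank lines are skipped by the NF check.
undefined_guard_refs() {
    awk 'NF >= 2 && $(NF-1) == "U" && $NF == "__guard_local" { n++ }
         END { print n + 0 }'
}

# Possible use:
#   nm /usr/local/lib/libgmp.a | undefined_guard_refs
```

If the above is right, this should print 0 for the Haskell archive
and something greater than 0 for libgmp.a.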


What can we do?

There are several options:

1 Expose __guard_local like your diff and mine do, to apparently
  "fix" ports which need Template Haskell at configure time. This
  is the least attractive option, because this approach is just
  broken (see the fibonacci examples above).

2 Read and understand rts/Linker.c and handle __guard_local correctly.
  Not very attractive, because it's a lot of pain and I really want
  to avoid touching this code.

3 Enable shared library support in ghc, so our ld.so(1) could take
  over. Unfortunately, there are three problems:

  a) I tried it, and got horrible failures during the build.

  b) On some of the GHC mailing lists, someone recently mentioned
     that ghci (and code using Template Haskell) still loads libraries
     via rts/Linker.c, even if shared libraries are enabled.

  c) I'm still very scared about the consequences of enabling shared
     library support for *Haskell* libraries.

4 Ignore the ghci problem for now but modify Template Haskell to
  *not* try to interpret Haskell code embedded in Haskell code but
  to create a temporary preprocessor which then translates the
  original source code to pure Haskell code. That would be a nice
  opportunity to learn more about Template Haskell, but it would
  never be accepted upstream, because it would slow down compilation
  a lot.

5 Let ghci (and ghc + Template Haskell) still use rts/Linker.c for
  loading *Haskell* libraries, but use dlopen(3) for non-Haskell
  extra-libraries. But wait! We now have to expose the symbols
  from the dlopen'ed library to the static Haskell library.


So, after all, this is almost a lost cause.

Ciao,
        Kili
