[Haskell-cafe] Compiler backend question

2008-01-01 Thread Peter Verswyvelen
If I understand it correctly, the GHC compiler either directly generates
machine code, or it uses C as an intermediate language.


I also read somewhere that C is not the most efficient intermediate 
representation for functional languages, and that one gets better 
performance when generating native machine code.


However, from the discussions on this list, I gather that the
low-level machine code optimizer of GHC is far from state-of-the-art
compared to industry-standard C/C++ compilers.


I was wondering: why doesn't GHC use the GCC (or any other standard
compiler) backend intermediate code? The backend of GCC generates highly
optimized code, no? Or is the intermediate code format of GCC (or other
compilers) not suitable for Haskell?


Another question regarding the backend: a cool feature of the Microsoft
Visual C++ (MVC) compiler is its ability to perform LTCG
(link-time code generation), performing whole program optimization. Is
something like this possible with Haskell (or the GCC backend)? Would it
be possible to feed all the generated C code of the GHC compiler into
the MVC compiler, to generate one big MVC / LTCG generated executable?
It would be interesting to see how much the whole program optimization
approach (which can do analysis on the program as if it was a single
module) would improve performance...


Cheers,
Peter




Re: [Haskell-cafe] Compiler backend question

2008-01-01 Thread Neil Mitchell
Hi

> If I understand it correctly, the GHC compiler either directly generates
> machine code, or it uses C as an intermediate language.
>
> I also read somewhere that C is not the most efficient intermediate
> representation for functional languages, and that one gets better
> performance when generating native machine code.
>
> However, from the discussions on this list, I gather that the low-level
> machine code optimizer of GHC is far from state-of-the-art compared to
> industry-standard C/C++ compilers.

These all seem fair comments, based on my understanding. One other
thing to remember is that the GHC people know this; much work has
been done, and will continue to be done, until the back end is up to
the standard of the front end.

> I was wondering: why doesn't GHC use the GCC (or any other standard
> compiler) backend intermediate code? The backend of GCC generates highly
> optimized code, no? Or is the intermediate code format of GCC (or other
> compilers) not suitable for Haskell?

GCC is optimised for dealing with code that comes from C, and the back
end language is much like C. GCC is also not really set up to be used
by bolting different front ends on to different back ends - part of
this is a license issue: if the front and back ends were well split
and available separately, you could write a commercial back end onto a
GPL front end.

> Another question regarding the backend: a cool feature of the Microsoft
> Visual C++ (MVC) compiler is its ability to perform LTCG
> (link-time code generation), performing whole program optimization. Is
> something like this possible with Haskell (or the GCC backend)? Would it
> be possible to feed all the generated C code of the GHC compiler into the
> MVC compiler, to generate one big MVC / LTCG generated executable? It would
> be interesting to see how much the whole program optimization approach
> (which can do analysis on the program as if it was a single module) would
> improve performance...

You would get much better improvements from whole-program optimisation
at the Haskell level. By the time you get to the generated C you have
obfuscated all the structure of the program, and are unlikely to
benefit that much.

Thanks

Neil


Re: [Haskell-cafe] Compiler backend question

2008-01-01 Thread Richard Kelsall

Peter Verswyvelen wrote:
> ...
> I was wondering: why doesn't GHC use the GCC (or any other standard
> compiler) backend intermediate code? The backend of GCC generates highly
> optimized code, no? Or is the intermediate code format of GCC (or other
> compilers) not suitable for Haskell?
> ...

My guess is that there is scope for producing assembly code that
out-performs GCC. I speak as someone who has no actual knowledge
of how GCC works :) I have a theory that the back-ends of most compilers,
including GCC, are an accretion of many hand-coded, hard-coded functions
performing very specific optimisations for particular instructions on
particular processors. These have been developed over the years by
relatively ad-hoc timings of the precise assembly code that the
individual who wrote each function wanted to improve.

So what's wrong with this approach?

When a new processor is introduced it needs a different set of
optimisations, which takes time to develop, so the compiler
is sub-optimal on new processors for a while.

Also, the number of processors we would like the compiler optimised
for is large if we're really trying to get the last few percent from
every processor. Modern processors aren't simple things like the old
6502 or Z80, which had definite, easily predictable timings. Think
different cache sizes, multiple pipelines, branch prediction, etc.

See "The microarchitecture of Intel and AMD CPUs: An optimization guide
for assembly programmers and compiler makers" here:
http://www.agner.org/optimize/

It seems unlikely, therefore, that the code generation for every modern
processor variant will be hand-optimised to account for all the small
details. Even in GCC, which no doubt has a huge amount of continuous
effort applied to it by many clever people, the developers are never
going to hand-code every complexity of every processor.

Take a small example: integer multiplication. Looking at the table
in the conclusion of the microarchitecture PDF, we see that different
x86 processors have different relative speeds of adding and multiplying;
something like this:

Execution latencies, typical    AMD64   P4E   PM   Core2
integer addition                    1     1    1       1
integer multiplication              3    10    4       3

We can compute
4 * 371
faster on P4E by doing
371 + 371 + 371 + 371
but on Core2 and AMD64 we should do
4 * 371
and on PM it might depend on other factors, I don't know. Now, supposing
we allow multiplication by powers of two via shifting, as well as integer
addition, can we immediately say the best way of multiplying two integers
on all these processors? Maybe roughly, but do we always get the absolute
fastest code? For example, it will depend on the values of the integers.
If we know one of them in advance, as I've assumed above, or if we
know they frequently take particular values, we might handle those cases
first, etc. This is just one small optimisation out of many. And the
optimisations might interact.
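
To make that concrete, here is a minimal Haskell sketch of the three
strategies for the known constant 4 (the function names are mine and
purely illustrative; what a compiler actually emits for each is of
course target-dependent):

  import Data.Bits (shiftL)

  -- Three ways to compute 4 * x for the known constant 4.  Which is
  -- fastest depends on the target CPU's latencies, per the table above
  -- (e.g. integer multiply costs 10 cycles on P4E, addition 1 cycle).

  byMultiply :: Int -> Int
  byMultiply x = 4 * x

  byAddition :: Int -> Int
  byAddition x = x + x + x + x      -- wins where multiply is slow

  byShift :: Int -> Int
  byShift x = x `shiftL` 2          -- 4 is a power of two: one shift

  main :: IO ()
  main = mapM_ (print . ($ 371)) [byMultiply, byAddition, byShift]

All three print 1484, of course; the interesting part is the generated
assembly, which you can inspect with GHC's -ddump-asm.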

So the inevitable conclusion is that the precise assembly code
optimisations must not be hand-coded directly into the back end of
the compiler, as I expect they currently are in most compilers, but
that they need to be expressed at a higher level ("repeated addition
is equivalent to multiplication") and adapted down to particular
processors and situations based on extensive, genetic-algorithm-guided
timing runs.
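
Incidentally, GHC can already express this kind of algebraic
equivalence at the source level with rewrite RULES, albeit without the
timing-guided selection I'm proposing. A minimal sketch (the module,
function, and rule names are mine, purely illustrative; whether a rule
fires depends on inlining):

  module Quadruple (quadruple) where

  import Data.Bits (shiftL)

  -- A multiplication we claim can always be done with a shift instead.
  quadruple :: Int -> Int
  quadruple x = 4 * x
  {-# NOINLINE quadruple #-}  -- keep call sites visible for the rule

  -- Ask GHC to rewrite every call site to use a shift.
  {-# RULES "quadruple/shift" forall x. quadruple x = x `shiftL` 2 #-}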

Anyway, that's my wild idea thrown in the pot. Might be a nice research
paper in there for someone :)


Richard.



RE: [Haskell-cafe] Compiler backend question

2008-01-01 Thread Peter Verswyvelen
Neil Mitchell wrote:
> GCC is optimised for dealing with code that comes from C, and the back
> end language is much like C. GCC is also not really set up to be used
> by bolting different front ends on to different back ends - part of
> this is a license issue: if the front and back ends were well split
> and available separately, you could write a commercial back end onto a
> GPL front end.

Well, I don't know about the licensing, but according to
http://en.wikipedia.org/wiki/GNU_Compiler_Collection#Front_ends, a new,
cleaner intermediate language was created for GCC in 2005, which might be
more general?

> You would get much better improvements from whole-program optimisation
> at the Haskell level. By the time you get to the generated C you have
> obfuscated all the structure of the program, and are unlikely to
> benefit that much.

Yes, that would be great! Or even at JIT time ;-) But, given the time GHC
takes to optimize a single module, I guess it would not really be practical
to let those optimizations take place on the full program? It would take
ages, no?

I just wanted to see if it is *possible* to feed all the C code from
GHC into a C compiler, and then generate a single executable from that C
code (including the GHC runtime). For example, for the XBOX devkit,
Microsoft shipped LTCG versions of all the core libraries, which gave a
nice performance improvement for free (if I'm not mistaken, this gives 10%
to 15% faster code). Actually, this LTCG thingy was first supported by the
Intel compiler, I believe.

Thanks,
Peter







Re: [Haskell-cafe] Compiler backend question

2008-01-01 Thread Bryan O'Sullivan
Peter Verswyvelen wrote:

> Well, I don't know about the licensing, but according to
> http://en.wikipedia.org/wiki/GNU_Compiler_Collection#Front_ends, a new,
> cleaner intermediate language was created for GCC in 2005, which might be
> more general?

It's still very difficult to work with GCC from the perspective of an
external tool.  Its current IR is targeted towards the languages it
supports and the needs of its back end, so it omits a fair bit of
information that third-party tools would find useful.

> I just wanted to see if it is *possible* to feed all the C code from
> GHC into a C compiler, and then generate a single executable from that C
> code (including the GHC runtime).

It would be a fair bit of work.  My recollection is that GHC knows
that it is talking to GCC when you go through -fvia-C, so it uses some
GCC extensions to handle things like register reservation.  A few
compilers support some of those GCC extensions, so you might have some
level of success with those.  The Intel and PathScale compilers do, for
example, and LLVM's clang front end has some level of GCC compatibility
already.

In the case of the PathScale compiler (which I used to work on), when
invoked in whole-program mode it emits files that look like regular .o
files, but are in fact just the serialised IR for a compilation unit.
It's also open source, so it would be easier to tinker with than the
Intel compiler.

> Actually, this LTCG thingy was first supported by the
> Intel compiler, I believe.

No, that kind of whole-program optimisation long predates its inclusion
in the Intel compiler.  The earliest I saw it in deployment was in MIPS
compilers circa 1992, and I don't recall it being a new idea at the time.

Anyway, the point about GCC being an unsuitable vehicle for this sort of
thing remains.  You'd do better looking at LLVM, for which I've lately
been working on nice Haskell bindings:

darcs get http://darcs.serpentine.com/llvm

b