Re: enormous executable

2001-10-04 Thread Alastair David Reid


Simon (talking about using -ffunction-sections)
 And there's bound to be some
 complication due to the assumptions we make in the RTS about the
 relative ordering of code/data.

Sounds like the mangler should do the function section magic.
Assuming the mangler understands where section boundaries can
and cannot go (I think this is true), this should be quite easy.

If you run this:
  
  $ cat  /tmp/tst.c
  int f(int x) {return x;}
  int g(int x) {return x;}
  $ gcc -ffunction-sections -o - -S /tmp/tst.c 
  
You'll see that the -ffunction-sections flag causes gcc to output these
section directives before the code implementing f and g.

  .section .text.f,"ax",@progbits
  .section .text.g,"ax",@progbits

The corresponding GNU linker magic constructs the .text section out of
all the .text.* sections.
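
To try the whole thing end to end, something like the following should
do.  This is only a sketch, assuming GCC and GNU ld on an ELF platform;
the file name /tmp/tst2.c and the changed body of g are made up for
illustration, and whether g really disappears from the final image
depends on the toolchain.

```c
/* Hypothetical file /tmp/tst2.c: f is meant to be called by the
 * program, g is not.
 *
 * Compile so that each function gets its own section:
 *   gcc -ffunction-sections -c /tmp/tst2.c -o /tmp/tst2.o
 * List the sections (on ELF targets, expect .text.f and .text.g):
 *   objdump -h /tmp/tst2.o
 * When this object is linked into a program whose main calls only f,
 * passing -Wl,--gc-sections asks GNU ld to discard sections that
 * nothing refers to.
 */
int f(int x) { return x; }      /* referenced by the program */
int g(int x) { return x + 1; }  /* never referenced: candidate for removal */
```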

-- 
Alastair Reid    [EMAIL PROTECTED]    http://www.cs.utah.edu/~reid/

___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users



RE: enormous executable

2001-10-03 Thread Simon Marlow


 I don't think the native ld on alpha-dec-osf3 supports such a feature,
 so we would (I assume) have to leave -split-objs in ghc even if we do
 implement -ffunction-sections/-fdata-sections.  (Would it just be a
 matter of enabling it when invoking gcc?  Would I be able to try it on
 my i386-linux box with -optc-ffunction-sections?  Or I suppose the
 mangler would need to be educated...)

AFAIK it only works on ELF, and only with GNU ld.  You could take
advantage of it when doing unregisterised compilation, but otherwise we
have to teach the mangler about it.  And there's bound to be some
complication due to the assumptions we make in the RTS about the
relative ordering of code/data.

Cheers,
Simon




RE: enormous executable

2001-10-02 Thread Simon Marlow

 | Surely the executable itself is only linked with the 
 | functions that are actually used by the program?  
 
 AFAIUI the GNU linker is not clever enough to remove junk
 on a per-function basis, only on a per-object basis.  This is
 why we do object-splitting -- by breaking libraries up into 
 thousands of .o files before rolling them into a .a, the
 effectiveness of what GNU ld can do is enhanced.
 
 Perhaps more recent GNU ld's do better on some platforms?
 I have a vague recollection of some -gc-sections flag.

Yup, but it needs compiler support.  The idea is to get the compiler to
put each function in its own section, then the linker removes unused
sections from the linked image.
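
The removal step can be pictured as a reachability pass over sections.
What follows is a toy model, not how GNU ld is actually implemented:
the struct layout and the names mark and count_live are invented for
illustration.  (The GNU ld option itself is spelled --gc-sections.)

```c
#include <assert.h>

#define MAX_SECTIONS 8

/* Toy model of a section: which other sections it refers to. */
struct section {
    const char *name;
    int refs[MAX_SECTIONS];   /* indices of referenced sections */
    int nrefs;                /* number of valid entries in refs */
    int live;                 /* set by mark() when reachable */
};

/* Mark every section reachable from index `root` (depth-first).
 * The live flag doubles as the "already visited" check, so cycles
 * between sections terminate. */
void mark(struct section *s, int nsections, int root)
{
    assert(root >= 0 && root < nsections);
    if (s[root].live)
        return;
    s[root].live = 1;
    for (int i = 0; i < s[root].nrefs; i++)
        mark(s, nsections, s[root].refs[i]);
}

/* Count the sections that survive, i.e. those marked live. */
int count_live(const struct section *s, int nsections)
{
    int n = 0;
    for (int i = 0; i < nsections; i++)
        n += s[i].live;
    return n;
}
```

Starting mark at the section holding the entry point and then dropping
everything with live == 0 gives the per-function granularity discussed
above, provided the compiler has put each function in its own section.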

Cheers,
Simon





Re: enormous executable

2001-10-02 Thread Marcin 'Qrczak' Kowalczyk

Mon, 1 Oct 2001 09:55:11 +0100, Simon Marlow [EMAIL PROTECTED] writes:

 We don't have support for shared libraries under Unix at the moment.  It
 has been investigated at various times in the past, and I believe the
 story is that we couldn't do it without at least losing some performance
 (more performance loss than you get from compiling C into a shared
 library).

There is also a big problem of binary incompatibility between different
versions of ghc, and dependence on the exact version of modules our
package depends on.

Dynamic libraries work for C because C has a very stable ABI. Also
it's easy to maintain binary compatibility in libraries. I'm afraid
that cross-module optimizations of ghc make dynamic libraries nearly
impractical.

-- 
 __(  Marcin Kowalczyk * [EMAIL PROTECTED] http://qrczak.ids.net.pl/
 \__/
  ^^  SYGNATURA ZASTĘPCZA
QRCZAK





RE: enormous executable

2001-10-02 Thread Manuel M. T. Chakravarty

Simon Marlow [EMAIL PROTECTED] wrote,

  | Surely the executable itself is only linked with the 
  | functions that are actually used by the program?  
  
  AFAIUI the GNU linker is not clever enough to remove junk
  on a per-function basis, only on a per-object basis.  This is
  why we do object-splitting -- by breaking libraries up into 
  thousands of .o files before rolling them into a .a, the
  effectiveness of what GNU ld can do is enhanced.
  
  Perhaps more recent GNU ld's do better on some platforms?
  I have a vague recollection of some -gc-sections flag.
 
 Yup, but it needs compiler support.  The idea is to get the compiler to
 put each function in its own section, then the linker removes unused
 sections from the linked image.

Sounds much better than the mess that -split-objs produces
on the harddisk.

Cheers,
Manuel




Re: enormous executable

2001-10-02 Thread Ken Shan

On 2001-10-03T10:37:24+1000, Manuel M. T. Chakravarty wrote:
 Simon Marlow [EMAIL PROTECTED] wrote,
   | Surely the executable itself is only linked with the 
   | functions that are actually used by the program?  
   AFAIUI the GNU linker is not clever enough to remove junk
   on a per-function basis, only on a per-object basis.  This is
   why we do object-splitting -- by breaking libraries up into 
   thousands of .o files before rolling them into a .a, the
   effectiveness of what GNU ld can do is enhanced.
   Perhaps more recent GNU ld's do better on some platforms?
   I have a vague recollection of some -gc-sections flag.
  Yup, but it needs compiler support.  The idea is to get the compiler to
  put each function in its own section, then the linker removes unused
  sections from the linked image.
 Sounds much better than the mess that -split-objs produces
 on the harddisk.

I don't think the native ld on alpha-dec-osf3 supports such a feature,
so we would (I assume) have to leave -split-objs in ghc even if we do
implement -ffunction-sections/-fdata-sections.  (Would it just be a
matter of enabling it when invoking gcc?  Would I be able to try it on
my i386-linux box with -optc-ffunction-sections?  Or I suppose the
mangler would need to be educated...)

-- 
Edit this signature at http://www.digitas.harvard.edu/cgi-bin/ken/sig


RE: enormous executable

2001-10-01 Thread Simon Marlow

   -rwxr-xr-x    1 willem   users      201453 Sep 29 09:55 hello
   -rw-r--r--    1 willem   users          51 Sep 29 09:54 hello.hs
   -rw-r--r--    1 willem   users        1468 Sep 29 09:55 hello.o
  
  Probably, the executable  hello  is 1000 times larger than object
  one because some piece of  library (including binary code for 
  outputting a string) is linked to it.
  In small user programs the library code is usually the larger 
  part. In large user programs, it will be the smaller part.
 
  Serge Mechveliani
 
 I think so too, but what I'm surprised about is that GHC apparently
 can't use printing functions and all those other things I saw in
 `strings hello`, for example things about sockets, forks and file IO,
 from a library instead of compiling it into the 'hello' binary.

We don't have support for shared libraries under Unix at the moment.  It
has been investigated at various times in the past, and I believe the
story is that we couldn't do it without at least losing some performance
(more performance loss than you get from compiling C into a shared
library).

So what you're seeing is static linking in action.  We do have some
hacks in our library building to split objects into small chunks so
that static linking doesn't result in *huge* binaries (see the
-split-objs flag), but this isn't being used in GTK+HS, hence the 2Mb
binaries.

Cheers,
Simon




RE: enormous executable

2001-10-01 Thread Julian Seward (Intl Vendor)


| OTOH when you start using and reusing larger libraries, like 
| Gtk+, the waste accumulates, and it's possible you don't need 
| to run too many GTK+HS programs before you see performance decrease 
| due to memory exhaustion. 

On Linux and probably most Unixes, the text and data segments
of the executable are loaded page-by-page into memory on
demand.  So having a lot of unused junk in the executable doesn't
necessarily increase the memory used, either real memory or
swap space.  I think it's basically harmless providing you're
only running one instance of the program on the machine.

J




Re: enormous executable

2001-10-01 Thread Ketil Malde

Julian Seward (Intl Vendor) [EMAIL PROTECTED] writes:

 On Linux and probably most Unixes, the text and data segments
 of the executable are loaded page-by-page into memory on
 demand.  So having a lot of unused junk in the executable doesn't
 necessarily increase the memory used, either real memory or
 swap space.  

(Surely not swap, since code is just paged from the executable image :-)

Surely the executable itself is only linked with the functions
that are actually used by the program?  

Or are you talking about working set? I'd expect a GUI library to
have considerable initialization etc. code, which is used relatively
rarely.  In that case, the waste is mostly disk space, which is
abundant these days. 

My thought was, however, that the GUI toolkit is probably used by
multiple other (non-haskell) programs, and could be usefully shared. 

 I think it's basically harmless providing you're
 only running one instance of the program on the machine.

Uh, multiple instances would share the executable in memory, if I
understand things correctly.

-kzm
-- 
If I haven't seen further, it is by standing in the footprints of giants




RE: enormous executable

2001-10-01 Thread Simon Marlow

 Julian Seward (Intl Vendor) [EMAIL PROTECTED] writes:
 
  On Linux and probably most Unixes, the text and data segments
  of the executable are loaded page-by-page into memory on
  demand.  So having a lot of unused junk in the executable doesn't
  necessarily increase the memory used, either real memory or
  swap space.  
 
 (Surely not swap, since code is just paged from the 
 executable image :-)

Not if the executable comes from an NFS server - it might be quicker to
back it with swap space.  I believe this is what some OSs do (though not
Linux, I think).

 Surely the executable itself is only linked with the functions
 that are actually used by the program?  

Sure, except that this is done at the level of granularity of a module,
which when we're talking about Haskell code tends to be quite large.
This is where GHC's -split-objs comes in: it splits modules into smaller
units, which reduces the chances that you end up linking with something
you don't need.
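
A rough way to see why the granularity matters, as a toy model only:
the struct member type and the linked_size function below are invented
for illustration, the sizes are arbitrary, and references between
members are ignored.  A member of a .a archive is taken whole as soon
as it defines any symbol the program needs.

```c
#include <string.h>

#define MAX_SYMS 4

/* Toy archive member: a .o file, the symbols it defines, and its size. */
struct member {
    const char *symbols[MAX_SYMS];
    int nsymbols;
    int size;   /* bytes contributed if the member is pulled in */
};

/* Returns 1 if member m defines symbol sym, 0 otherwise. */
static int defines(const struct member *m, const char *sym)
{
    for (int i = 0; i < m->nsymbols; i++)
        if (strcmp(m->symbols[i], sym) == 0)
            return 1;
    return 0;
}

/* Total size of the members pulled in to satisfy the needed symbols.
 * Each member is counted once, and always in full. */
int linked_size(const struct member *ar, int nmembers,
                const char *const *needed, int nneeded)
{
    int total = 0;
    for (int i = 0; i < nmembers; i++) {
        for (int j = 0; j < nneeded; j++) {
            if (defines(&ar[i], needed[j])) {
                total += ar[i].size;
                break;   /* this member is already in; move on */
            }
        }
    }
    return total;
}
```

With one 300-byte member defining label, window and radio, a program
that needs only label and window still pays for all 300 bytes; with
the same code split into three 100-byte members it pays for 200.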

Actually static linking is better for locality in the binary, because
the memory paged into the running executable is more likely to be
densely packed with useful functions.  Shared libraries only really come
into their own when there are multiple running programs using the same
library - otherwise they are a loss for both performance and memory use.

 Or are you talking about working set? I'd expect a GUI library to
 have considerable initialization etc. code, which is used relatively
 rarely.  In that case, the waste is mostly disk space, which is
 abundant these days. 
 
 My thought was, however, that the GUI toolkit is probably used by
 multiple other (non-haskell) programs, and could be usefully shared. 

The C code in GTK lives in shared libraries, which are still dynamically
linked against the executable when you use GTK+HS.  It's just the
Haskell code which is statically linked.

Cheers,
Simon




Re: enormous executable

2001-09-29 Thread Willem Robert van Hage

  -rwxr-xr-x    1 willem   users      201453 Sep 29 09:55 hello
  -rw-r--r--    1 willem   users          51 Sep 29 09:54 hello.hs
  -rw-r--r--    1 willem   users        1468 Sep 29 09:55 hello.o
 
 Probably, the executable  hello  is 1000 times larger than object
 one because some piece of  library (including binary code for 
 outputting a string) is linked to it.
 In small user programs the library code is usually the larger 
 part. In large user programs, it will be the smaller part.

 Serge Mechveliani

I think so too, but what I'm surprised about is that GHC apparently
can't use printing functions and all those other things I saw in
`strings hello`, for example things about sockets, forks and file IO,
from a library instead of compiling it into the 'hello' binary.
In the case of a 200k hello world file that isn't a big problem,
but I want to do things with the gtk+hs toolkit and I've noticed that
even the simplest programs take up 2Mb. 'strings' on an executable
that uses gtk+hs show an endless list of unused functions.
Things like functions for radiobuttons while the only thing I use is
a label, a box and a window. I find that kind of strange.
I'm using GHC on a laptop with not a very big harddisk.

Doesn't GHC first check which functions actually get used and then
only link those to the binary?


Willem van Hage

-- 
[EMAIL PROTECTED] | http://www.xs4all.nl/~wrvh
[EMAIL PROTECTED] | http://quest.sourceforge.net
