Re: enormous executable

2001-10-04 Thread Alastair David Reid


Simon (talking about using -ffunction-sections)
> And there's bound to be some
> complication due to the assumptions we make in the RTS about the
> relative ordering of code/data.

Sounds like the mangler should do the function section magic.
Assuming the mangler understands where section boundaries can
and cannot go (I think this is true), this should be quite easy.

If you run this:
  
  $ cat > /tmp/tst.c
  int f(int x) {return x;}
  int g(int x) {return x;}
  $ gcc -ffunction-sections -o - -S /tmp/tst.c 
  
You'll see that the -ffunction-sections flag causes gcc to output these
section directives before the code implementing f and g.

  .section .text.f,"ax",@progbits
  .section .text.g,"ax",@progbits

The corresponding GNU linker magic constructs the .text section out of
all the .text.* sections.
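
A slightly larger sketch of the same experiment, covering data as well
as code. The file name sections.c and the names counter, bump and reset
are made up for illustration. Assuming gcc and GNU binutils on an ELF
platform, compiling with
"gcc -ffunction-sections -fdata-sections -c sections.c" and then running
"objdump -h sections.o" should list one section per function and per
variable. Linking with -Wl,--gc-sections then asks GNU ld to drop the
sections that nothing refers to.

```c
/* sections.c: hypothetical file name, not from the thread. */
#include <assert.h>

/* With -fdata-sections, gcc is expected to give this variable its own
   data section (named along the lines of .data.counter). */
int counter = 1;

/* With -ffunction-sections, each function gets its own text section
   (.text.bump and .text.reset here), as with f and g above. */
int bump(int step) {
    assert(step > 0);   /* this sketch only counts upwards */
    counter += step;
    return counter;
}

void reset(void) {
    counter = 1;
}
```

If a program calls bump but never reset, only the section holding reset
is a candidate for removal at link time; counter stays because bump
refers to it.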

-- 
Alastair Reid  [EMAIL PROTECTED]  http://www.cs.utah.edu/~reid/

___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users



RE: enormous executable

2001-10-03 Thread Simon Marlow


> I don't think the native ld on alpha-dec-osf3 supports such a feature,
> so we would (I assume) have to leave -split-objs in ghc even if we do
> implement -ffunction-sections/-fdata-sections.  (Would it just be a
> matter of enabling it when invoking gcc?  Would I be able to try it on
> my i386-linux box with -optc-ffunction-sections?  Or I suppose the
> mangler would need to be educated...)

AFAIK it only works on ELF, and only with GNU ld.  You could take
advantage of it when doing unregisterised compilation, but otherwise we
have to teach the mangler about it.  And there's bound to be some
complication due to the assumptions we make in the RTS about the
relative ordering of code/data.

Cheers,
Simon




Re: enormous executable

2001-10-02 Thread Ken Shan

On 2001-10-03T10:37:24+1000, Manuel M. T. Chakravarty wrote:
> "Simon Marlow" <[EMAIL PROTECTED]> wrote,
> > > | Surely the executable itself is only linked with the 
> > > | functions that are actually used by the program?  
> > > AFAIUI the GNU linker is not clever enough to remove junk
> > > on a per-function basis, only on a per-object basis.  This is
> > > why we do object-splitting -- by breaking libraries up into 
> > > thousands of .o files before rolling them into a .a, the
> > > effectiveness of what GNU ld can do is enhanced.
> > > Perhaps more recent GNU ld's do better on some platforms?
> > > I have a vague recollection of some -gc-sections flag.
> > Yup, but it needs compiler support.  The idea is to get the compiler to
> > put each function in its own section, then the linker removes unused
> > sections from the linked image.
> Sounds much better than the mess that -split-objs produces
> on the harddisk.

I don't think the native ld on alpha-dec-osf3 supports such a feature,
so we would (I assume) have to leave -split-objs in ghc even if we do
implement -ffunction-sections/-fdata-sections.  (Would it just be a
matter of enabling it when invoking gcc?  Would I be able to try it on
my i386-linux box with -optc-ffunction-sections?  Or I suppose the
mangler would need to be educated...)

-- 
Edit this signature at http://www.digitas.harvard.edu/cgi-bin/ken/sig


RE: enormous executable

2001-10-02 Thread Manuel M. T. Chakravarty

"Simon Marlow" <[EMAIL PROTECTED]> wrote,

> > | Surely the executable itself is only linked with the 
> > | functions that are actually used by the program?  
> > 
> > AFAIUI the GNU linker is not clever enough to remove junk
> > on a per-function basis, only on a per-object basis.  This is
> > why we do object-splitting -- by breaking libraries up into 
> > thousands of .o files before rolling them into a .a, the
> > effectiveness of what GNU ld can do is enhanced.
> > 
> > Perhaps more recent GNU ld's do better on some platforms?
> > I have a vague recollection of some -gc-sections flag.
> 
> Yup, but it needs compiler support.  The idea is to get the compiler to
> put each function in its own section, then the linker removes unused
> sections from the linked image.

Sounds much better than the mess that -split-objs produces
on the harddisk.

Cheers,
Manuel




Re: enormous executable

2001-10-02 Thread Marcin 'Qrczak' Kowalczyk

Mon, 1 Oct 2001 09:55:11 +0100, Simon Marlow <[EMAIL PROTECTED]> writes:

> We don't have support for shared libraries under Unix at the moment.  It
> has been investigated at various times in the past, and I believe the
> story is that we couldn't do it without at least losing some performance
> (more performance loss than you get from compiling C into a shared
> library).

There is also a big problem of binary incompatibility between different
versions of ghc, and dependence on the exact version of modules our
package depends on.

Dynamic libraries work for C because C has a very stable ABI. Also
it's easy to maintain binary compatibility in libraries. I'm afraid
that cross-module optimizations of ghc make dynamic libraries nearly
impractical.

-- 
 __("<  Marcin Kowalczyk * [EMAIL PROTECTED] http://qrczak.ids.net.pl/
 \__/
  ^^  SYGNATURA ZASTĘPCZA
QRCZAK





RE: enormous executable

2001-10-02 Thread Simon Marlow

> | Surely the executable itself is only linked with the 
> | functions that are actually used by the program?  
> 
> AFAIUI the GNU linker is not clever enough to remove junk
> on a per-function basis, only on a per-object basis.  This is
> why we do object-splitting -- by breaking libraries up into 
> thousands of .o files before rolling them into a .a, the
> effectiveness of what GNU ld can do is enhanced.
> 
> Perhaps more recent GNU ld's do better on some platforms?
> I have a vague recollection of some -gc-sections flag.

Yup, but it needs compiler support.  The idea is to get the compiler to
put each function in its own section, then the linker removes unused
sections from the linked image.
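
The idea can be modelled as a reachability walk over sections. The
sketch below is a toy model only, not how GNU ld is implemented: the
SectionGraph type, the fixed-size table, and the example numbering
(0 = main, 1 = label code, 2 = window code, 3 = radio-button code) are
all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SECTIONS 8

/* Toy model of a link: refs[i][j] is true when section i refers to a
   symbol that lives in section j. */
typedef struct {
    size_t count;
    bool refs[MAX_SECTIONS][MAX_SECTIONS];
} SectionGraph;

/* Mark every section reachable from `root`, depth first. */
static void mark_live(const SectionGraph *g, size_t root,
                      bool live[MAX_SECTIONS]) {
    assert(root < g->count);
    if (live[root]) return;
    live[root] = true;
    for (size_t j = 0; j < g->count; j++)
        if (g->refs[root][j]) mark_live(g, j, live);
}

/* Count the sections that would be kept when `entry` is the only root.
   Everything left unmarked is what a section-collecting linker drops. */
size_t count_live(const SectionGraph *g, size_t entry) {
    bool live[MAX_SECTIONS] = { false };
    size_t kept = 0;
    assert(g->count <= MAX_SECTIONS);
    mark_live(g, entry, live);
    for (size_t i = 0; i < g->count; i++)
        if (live[i]) kept++;
    return kept;
}

/* Hypothetical program: main uses a label and a window, the label code
   uses the window code, and nothing refers to the radio-button code. */
SectionGraph example_graph(void) {
    SectionGraph g = { .count = 4 };
    g.refs[0][1] = true;
    g.refs[0][2] = true;
    g.refs[1][2] = true;
    return g;
}
```

With section 0 as the root, three of the four sections are kept and the
radio-button section is discarded. The finer the sections, the less
dead code each kept section drags along with it.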

Cheers,
Simon





RE: enormous executable

2001-10-01 Thread Julian Seward (Intl Vendor)


| Surely the executable itself is only linked with the 
| functions that are actually used by the program?  

AFAIUI the GNU linker is not clever enough to remove junk
on a per-function basis, only on a per-object basis.  This is
why we do object-splitting -- by breaking libraries up into 
thousands of .o files before rolling them into a .a, the
effectiveness of what GNU ld can do is enhanced.

Perhaps more recent GNU ld's do better on some platforms?
I have a vague recollection of some -gc-sections flag.

J




RE: enormous executable

2001-10-01 Thread Simon Marlow

> "Julian Seward (Intl Vendor)" <[EMAIL PROTECTED]> writes:
> 
> > On Linux and probably most Unixes, the text and data segments
> > of the executable are loaded page-by-page into memory on
> > demand.  So having a lot of unused junk in the executable doesn't
> > necessarily increase the memory used, either real memory or
> > swap space.  
> 
> (Surely not swap, since code is just paged from the 
> executable image :-)

Not if the executable comes from an NFS server - it might be quicker to
back it with swap space.  I believe this is what some OSs do (though not
Linux, I think).

> Surely the executable itself is only linked with the functions
> that are actually used by the program?  

Sure, except that this is done at the level of granularity of a module,
which when we're talking about Haskell code tends to be quite large.
This is where GHC's -split-objs comes in: it splits modules into smaller
units, which reduces the chances that you end up linking with something
you don't need.
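
A rough way to picture the effect of granularity, as a sketch under
simplifying assumptions: the Function type and the byte counts in the
usage note are invented, and references between functions are ignored.
It compares the bytes pulled in when a module is one linkable unit
against the case where every function is its own unit, which is what
-split-objs approximates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One function in a module: its code size and whether the program
   actually needs it. Sizes are made-up numbers, not measurements. */
typedef struct {
    size_t size;
    bool needed;
} Function;

/* Module as one unit: if any function is needed, all of it is linked. */
size_t linked_whole_module(const Function *fns, size_t n) {
    size_t total = 0;
    bool any_needed = false;
    assert(fns != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        total += fns[i].size;
        any_needed = any_needed || fns[i].needed;
    }
    return any_needed ? total : 0;
}

/* Every function as its own unit: only the needed ones are linked. */
size_t linked_split(const Function *fns, size_t n) {
    size_t total = 0;
    assert(fns != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (fns[i].needed) total += fns[i].size;
    return total;
}
```

For a module with functions of 100, 400 and 250 bytes where only the
first is needed, the whole-module case links 750 bytes and the split
case links 100.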

Actually static linking is better for locality in the binary, because
the memory paged into the running executable is more likely to be
densely packed with useful functions.  Shared libraries only really come
into their own when there are multiple running programs using the same
library - otherwise they are a loss for both performance and memory use.

> Or are you talking about working set? I'd expect a GUI library to
> have considerable initialization etc. code, which is used relatively
> rarely.  In that case, the waste is mostly disk space, which is
> abundant these days. 
> 
> My thought was, however, that the GUI toolkit is probably used by
> multiple other (non-haskell) programs, and could be usefully shared. 

The C code in GTK lives in shared libraries, which are still dynamically
linked against the executable when you use GTK+HS.  It's just the
Haskell code which is statically linked.

Cheers,
Simon




Re: enormous executable

2001-10-01 Thread Ketil Malde

"Julian Seward (Intl Vendor)" <[EMAIL PROTECTED]> writes:

> On Linux and probably most Unixes, the text and data segments
> of the executable are loaded page-by-page into memory on
> demand.  So having a lot of unused junk in the executable doesn't
> necessarily increase the memory used, either real memory or
> swap space.  

(Surely not swap, since code is just paged from the executable image :-)

Surely the executable itself is only linked with the functions
that are actually used by the program?  

Or are you talking about working set? I'd expect a GUI library to
have considerable initialization etc. code, which is used relatively
rarely.  In that case, the waste is mostly disk space, which is
abundant these days. 

My thought was, however, that the GUI toolkit is probably used by
multiple other (non-haskell) programs, and could be usefully shared. 

> I think it's basically harmless providing you're
> only running one instance of the program on the machine.

Uh, multiple instances would share the executable in memory, if I
understand things correctly.

-kzm
-- 
If I haven't seen further, it is by standing in the footprints of giants




RE: enormous executable

2001-10-01 Thread Julian Seward (Intl Vendor)


| OTOH when you start using and reusing larger libraries, like 
| Gtk+, the waste accumulates, and it's possible you don't need 
| to run too many GTK+HS programs before you see performance decrease 
| due to memory exhaustion. 

On Linux and probably most Unixes, the text and data segments
of the executable are loaded page-by-page into memory on
demand.  So having a lot of unused junk in the executable doesn't
necessarily increase the memory used, either real memory or
swap space.  I think it's basically harmless providing you're
only running one instance of the program on the machine.

J




Re: enormous executable

2001-10-01 Thread Ketil Malde

"Simon Marlow" <[EMAIL PROTECTED]> writes:

> We don't have support for shared libraries under Unix at the moment.  It
> has been investigated at various times in the past, and I believe the
> story is that we couldn't do it without at least losing some performance

Of course, dynamic linking only helps you if you run different
programs[0] using the same library.  My impression was that normally,
Haskell programs link with the run time system and such, and unless
you're running a lot of different Haskell programs, the gains aren't
very large. 

OTOH when you start using and reusing larger libraries, like Gtk+, the
waste accumulates, and it's possible you don't need to run too many
GTK+HS programs before you see performance decrease due to memory
exhaustion. 

Probably not a big deal in the real world, but it may be occasionally
worth it to take a slight performance decrease to avoid swapping.

-kzm

[0] Multiple instances of the same program share the code segment, of
course. 
-- 
If I haven't seen further, it is by standing in the footprints of giants




RE: enormous executable

2001-10-01 Thread Simon Marlow

> > > -rwxr-xr-x   1 willem   users      201453 Sep 29 09:55 hello
> > > -rw-r--r--   1 willem   users          51 Sep 29 09:54 hello.hs
> > > -rw-r--r--   1 willem   users        1468 Sep 29 09:55 hello.o
> > 
> > Probably, the executable  hello  is about 140 times larger than the
> > object file because some piece of the library (including binary code for
> > outputting a string) is linked to it.
> > In small user programs the library code is usually the larger 
> > part. In large user programs, it will be the smaller part.
> >
> > Serge Mechveliani
> 
> I think so too, but what I'm surprised about is that GHC apparently
> can't use printing functions and all those other things I saw in
> `strings hello`, for example things about sockets, forks and file IO,
> from a library instead of compiling it into the 'hello' binary.

We don't have support for shared libraries under Unix at the moment.  It
has been investigated at various times in the past, and I believe the
story is that we couldn't do it without at least losing some performance
(more performance loss than you get from compiling C into a shared
library).

So what you're seeing is static linking in action.  We do have some
hacks in our library building to "split" objects into small chunks so
that static linking doesn't result in *huge* binaries (see the
-split-objs flag), but this isn't being used in GTK+HS, hence the 2Mb
binaries.

Cheers,
Simon




Re: enormous executable

2001-09-29 Thread Willem Robert van Hage

> > -rwxr-xr-x   1 willem   users      201453 Sep 29 09:55 hello
> > -rw-r--r--   1 willem   users          51 Sep 29 09:54 hello.hs
> > -rw-r--r--   1 willem   users        1468 Sep 29 09:55 hello.o
> 
> Probably, the executable  hello  is about 140 times larger than the
> object file because some piece of the library (including binary code for
> outputting a string) is linked to it.
> In small user programs the library code is usually the larger 
> part. In large user programs, it will be the smaller part.
>
> Serge Mechveliani

I think so too, but what I'm surprised about is that GHC apparently
can't use printing functions and all those other things I saw in
`strings hello`, for example things about sockets, forks and file IO,
from a library instead of compiling it into the 'hello' binary.
In the case of a 200k hello world file that isn't a big problem,
but I want to do things with the gtk+hs toolkit and I've noticed that
even the simplest programs take up 2Mb. 'strings' on an executable
that uses gtk+hs show an endless list of unused functions.
Things like functions for radiobuttons while the only thing I use is
a label, a box and a window. I find that kind of strange.
I'm using GHC on a laptop whose hard disk is not very big.

Doesn't GHC first check which functions actually get used and then
only link those to the binary?


Willem van Hage

-- 
[EMAIL PROTECTED] | http://www.xs4all.nl/~wrvh
[EMAIL PROTECTED] | http://quest.sourceforge.net




Re: enormous executable

2001-09-29 Thread S.D.Mechveliani

Willem Robert van Hage <[EMAIL PROTECTED]> writes

> [..]
> So I tried a simple hello world example exactly like the one
> I saw somewhere on the GHC pages that compiled to a binary of 6kb
> and I was surprised to see that even after stripping the binary
> it still takes up 140kb.
>
> -rwxr-xr-x   1 willem   users      201453 Sep 29 09:55 hello
> -rw-r--r--   1 willem   users          51 Sep 29 09:54 hello.hs
> -rw-r--r--   1 willem   users        1468 Sep 29 09:55 hello.o

It shows that the source program takes 51 bytes and the
compiled object program 1468 bytes.
This is not much.
Probably, the executable  hello  is about 140 times larger than the
object file (201453 bytes against 1468) because some piece of the library
(including binary code for outputting a string)
is linked to it.
In small user programs the library code is usually the larger
part.
In large user programs, it will be the smaller part.

Regards,

-
Serge Mechveliani
[EMAIL PROTECTED]







