Re: enormous executable
Simon (talking about using -ffunction-sections)
> And there's bound to be some
> complication due to the assumptions we make in the RTS about the
> relative ordering of code/data.

Sounds like the mangler should do the function section magic.
Assuming the mangler understands where section boundaries can and
cannot go (I think this is true), this should be quite easy.

If you run this:

  $ cat > /tmp/tst.c
  int f(int x) {return x;}
  int g(int x) {return x;}
  $ gcc -ffunction-sections -o - -S /tmp/tst.c

You'll see that the -ffunction-sections flag causes gcc to output
these section directives before the code implementing f and g.

  .section .text.f,"ax",@progbits
  .section .text.g,"ax",@progbits

The corresponding GNU linker magic constructs the .text section out of
all the .text.* sections.

-- 
Alastair Reid [EMAIL PROTECTED] http://www.cs.utah.edu/~reid/

___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
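A minimal sketch of how the experiment in the message above might be
extended to the link step. The file name gc_demo.c and the function
names are made up for illustration, and the build commands in the
comment assume GCC with GNU ld on an ELF target; nothing here is taken
from GHC or its mangler.

```c
/* gc_demo.c (hypothetical file name).
 *
 * Build sketch, assuming GCC and GNU ld on an ELF target:
 *   gcc -ffunction-sections -c gc_demo.c
 *   readelf -S gc_demo.o    # expect .text.used_fn and .text.unused_fn
 * When this object is later linked into a program with
 *   gcc -Wl,--gc-sections ...
 * any section that nothing references becomes a candidate for removal.
 */

/* Intended to be called by the program, so its section stays reachable. */
int used_fn(int x) { return x + 1; }

/* Intended to be left uncalled: with one section per function,
 * the linker is free to drop it. */
int unused_fn(int x) { return x * 2; }
```

Comparing the output of nm on a program linked with and without
-Wl,--gc-sections should show whether unused_fn survived.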
RE: enormous executable
> I don't think the native ld on alpha-dec-osf3 supports such a feature,
> so we would (I assume) have to leave -split-objs in ghc even if we do
> implement -ffunction-sections/-fdata-sections.  (Would it just be a
> matter of enabling it when invoking gcc?  Would I be able to try it on
> my i386-linux box with -optc-ffunction-sections?  Or I suppose the
> mangler would need to be educated...)

AFAIK it only works on ELF, and only with GNU ld.  You could take
advantage of it when doing unregisterised compilation, but otherwise we
have to teach the mangler about it.  And there's bound to be some
complication due to the assumptions we make in the RTS about the
relative ordering of code/data.

Cheers,
Simon
Re: enormous executable
On 2001-10-03T10:37:24+1000, Manuel M. T. Chakravarty wrote:
> "Simon Marlow" <[EMAIL PROTECTED]> wrote,
> > > | Surely the executable itself is only linked with the
> > > | functions that are actually used by the program?
> > > AFAIUI the GNU linker is not clever enough to remove junk
> > > on a per-function basis, only on a per-object basis.  This is
> > > why we do object-splitting -- by breaking libraries up into
> > > thousands of .o files before rolling them into a .a, the
> > > effectiveness of what GNU ld can do is enhanced.
> > > Perhaps more recent GNU ld's do better on some platforms?
> > > I have a vague recollection of some -gc-sections flag.
> > Yup, but it needs compiler support.  The idea is to get the compiler to
> > put each function in its own section, then the linker removes unused
> > sections from the linked image.
> Sounds much better than the mess that -split-objs produces
> on the harddisk.

I don't think the native ld on alpha-dec-osf3 supports such a feature,
so we would (I assume) have to leave -split-objs in ghc even if we do
implement -ffunction-sections/-fdata-sections.  (Would it just be a
matter of enabling it when invoking gcc?  Would I be able to try it on
my i386-linux box with -optc-ffunction-sections?  Or I suppose the
mangler would need to be educated...)

-- 
Edit this signature at http://www.digitas.harvard.edu/cgi-bin/ken/sig
RE: enormous executable
"Simon Marlow" <[EMAIL PROTECTED]> wrote, > > | Surely the executable itself is only linked with the > > | functions that are actually used by the program? > > > > AFAIUI the GNU linker is not clever enough to remove junk > > on a per-function basis, only on a per-object basis. This is > > why we do object-splitting -- by breaking libraries up into > > thousands of .o files before rolling them into a .a, the > > effectiveness of what GNU ld can do is enhanced. > > > > Perhaps more recent GNU ld's do better on some platforms? > > I have a vague recollection of some -gc-sections flag. > > Yup, but it needs compiler support. The idea is to get the compiler to > put each function in its own section, then the linker removes unused > sections from the linked image. Sounds much better than the mess that -split-objs produces on the harddisk. Cheers, Manuel ___ Glasgow-haskell-users mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: enormous executable
Mon, 1 Oct 2001 09:55:11 +0100, Simon Marlow <[EMAIL PROTECTED]> writes:

> We don't have support for shared libraries under Unix at the moment.  It
> has been investigated at various times in the past, and I believe the
> story is that we couldn't do it without at least losing some performance
> (more performance loss than you get from compiling C into a shared
> library).

There is also a big problem of binary incompatibility between different
versions of ghc, and dependence on the exact version of modules our
package depends on.

Dynamic libraries work for C because C has a very stable ABI. Also it's
easy to maintain binary compatibility in libraries. I'm afraid that
cross-module optimizations of ghc make dynamic libraries nearly
impractical.

-- 
 __("<  Marcin Kowalczyk * [EMAIL PROTECTED] http://qrczak.ids.net.pl/
 \__/
  ^^                      SYGNATURA ZASTĘPCZA
QRCZAK
RE: enormous executable
> | Surely the executable itself is only linked with the
> | functions that are actually used by the program?
>
> AFAIUI the GNU linker is not clever enough to remove junk
> on a per-function basis, only on a per-object basis.  This is
> why we do object-splitting -- by breaking libraries up into
> thousands of .o files before rolling them into a .a, the
> effectiveness of what GNU ld can do is enhanced.
>
> Perhaps more recent GNU ld's do better on some platforms?
> I have a vague recollection of some -gc-sections flag.

Yup, but it needs compiler support.  The idea is to get the compiler to
put each function in its own section, then the linker removes unused
sections from the linked image.

Cheers,
Simon
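To make "put each function in its own section" concrete, here is a
hedged sketch that does by hand what the compiler would otherwise have
to do for every function. It assumes GCC or Clang targeting ELF and
uses the section attribute those compilers provide; the OWN_SECTION
macro and the function names are invented for this illustration, and on
other toolchains the macro expands to nothing.

```c
/* Assumption: GCC or Clang on an ELF target. Elsewhere the macro is
 * empty and the functions land in the default text section. */
#if defined(__GNUC__) && defined(__ELF__)
#define OWN_SECTION(name) __attribute__((section(".text." name)))
#else
#define OWN_SECTION(name)
#endif

/* Each function is placed in a separately named section, so a linker
 * that garbage-collects sections can keep one and discard the other. */
OWN_SECTION("square") int square(int x) { return x * x; }
OWN_SECTION("cube")   int cube(int x)   { return x * x * x; }
```

Annotating every function this way does not scale, which is why the
thread talks about a compiler flag (or the mangler) doing it instead.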
RE: enormous executable
| Surely the executable itself is only linked with the
| functions that are actually used by the program?

AFAIUI the GNU linker is not clever enough to remove junk
on a per-function basis, only on a per-object basis.  This is
why we do object-splitting -- by breaking libraries up into
thousands of .o files before rolling them into a .a, the
effectiveness of what GNU ld can do is enhanced.

Perhaps more recent GNU ld's do better on some platforms?
I have a vague recollection of some -gc-sections flag.

J
RE: enormous executable
> "Julian Seward (Intl Vendor)" <[EMAIL PROTECTED]> writes: > > > On Linux and probably most Unixes, the text and data segments > > of the executable are loaded page-by-page into memory on > > demand. So having a lot of unused junk in the executable doesn't > > necessarily increase the memory used, either real memory or > > swap space. > > (Surely not swap, since code is just paged from the > executable image :-) Not if the executable comes from an NFS server - it might be quicker to back it with swap space. I believe this is what some OSs do (though not Linux, I think). > Surely the executable itself is only linked with the functions > that are actually used by the program? Sure, except that this is done at the level of granularity of a module, which when we're talking about Haskell code tends to be quite large. This is where GHC's -split-objs comes in: it splits modules into smaller units, which reduces the chances that you end up linking with something you don't need. Actually static linking is better for locality in the binary, because the memory paged into the running executable is more likely to be densely packed with useful functions. Shared libraries only really come into their own when there are multiple running programs using the same library - otherwise they are a loss for both performance and memory use. > Or are you talking about working set? I'd expect a GUI library to > have considerable initialization etc. code, which is used relatively > rarely. In that case, the waste is mostly disk space, which is > abundant these days. > > My thought was, however, that the GUI toolkit is probably used by > multiple other (non-haskell) programs, and could be usefully shared. The C code in GTK lives in shared libraries, which are still dynamically linked against the executable when you use GTK+HS. It's just the Haskell code which is statically linked. 
Cheers, Simon ___ Glasgow-haskell-users mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: enormous executable
"Julian Seward (Intl Vendor)" <[EMAIL PROTECTED]> writes: > On Linux and probably most Unixes, the text and data segments > of the executable are loaded page-by-page into memory on > demand. So having a lot of unused junk in the executable doesn't > necessarily increase the memory used, either real memory or > swap space. (Surely not swap, since code is just paged from the executable image :-) Surely the executable itself is only linked with the functions that are actually used by the program? Or are you talking about working set? I'd expect a GUI library to have considerable initialization etc. code, which is used relatively rarely. In that case, the waste is mostly disk space, which is abundant these days. My thought was, however, that the GUI toolkit is probably used by multiple other (non-haskell) programs, and could be usefully shared. > I think it's basically harmless providing you're > only running one instance of the program on the machine. Uh, multiple instances would share the executable in memory, if I understand things correctly. -kzm -- If I haven't seen further, it is by standing in the footprints of giants ___ Glasgow-haskell-users mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
RE: enormous executable
| OTOH when you start using and reusing larger libraries, like
| Gtk+, the waste accumulates, and it's possible you don't need
| to run too many GTK+HS programs before you see performance decrease
| due to memory exhaustion.

On Linux and probably most Unixes, the text and data segments
of the executable are loaded page-by-page into memory on
demand.  So having a lot of unused junk in the executable doesn't
necessarily increase the memory used, either real memory or
swap space.  I think it's basically harmless providing you're
only running one instance of the program on the machine.

J
Re: enormous executable
"Simon Marlow" <[EMAIL PROTECTED]> writes: > We don't have support for shared libraries under Unix at the moment. It > has been investigated at various times in the past, and I believe the > story is that we couldn't do it without at least losing some performance Of course, dynamic linking only helps you if you run different programs[0] using the same library. My impression was that normally, Haskell programs link with the run time system and such, and unless you're running a lot of different Haskell programs, the gains aren't very large. OTOH when you start using and reusing larger libraries, like Gtk+, the waste accumulates, and it's possibly you don't need to run too many GTK+HS programs before you see performance decrease due to memory exhaustion. Probably not a big deal in the real world, but it may be occasionally worth it to take a slight performance decrease to avoid swapping. -kzm [0] Multiple instances of the same program share the code segment, of course. -- If I haven't seen further, it is by standing in the footprints of giants ___ Glasgow-haskell-users mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
RE: enormous executable
> > > -rwxr-xr-x  1 willem users 201453 Sep 29 09:55 hello
> > > -rw-r--r--  1 willem users     51 Sep 29 09:54 hello.hs
> > > -rw-r--r--  1 willem users   1468 Sep 29 09:55 hello.o
> >
> > Probably, the executable hello is 1000 times larger than object
> > one because some piece of library (including binary code for
> > outputting a string) is linked to it.
> > In small user programs the library code is usually the larger
> > part. In large user programs, it will be the smaller part.
> >
> > Serge Mechveliani
>
> I think so too, but what I'm surprised about is that GHC apparently
> can't use printing functions and all those other things I saw in
> `strings hello`, for example things about sockets, forks and file IO,
> from a library instead of compiling it into the 'hello' binary.

We don't have support for shared libraries under Unix at the moment.  It
has been investigated at various times in the past, and I believe the
story is that we couldn't do it without at least losing some performance
(more performance loss than you get from compiling C into a shared
library).

So what you're seeing is static linking in action.  We do have some
hacks in our library building to "split" objects into small chunks so
that static linking doesn't result in *huge* binaries (see the
-split-objs flag), but this isn't being used in GTK+HS, hence the 2Mb
binaries.

Cheers,
Simon
Re: enormous executable
> > -rwxr-xr-x  1 willem users 201453 Sep 29 09:55 hello
> > -rw-r--r--  1 willem users     51 Sep 29 09:54 hello.hs
> > -rw-r--r--  1 willem users   1468 Sep 29 09:55 hello.o
>
> Probably, the executable hello is 1000 times larger than object
> one because some piece of library (including binary code for
> outputting a string) is linked to it.
> In small user programs the library code is usually the larger
> part. In large user programs, it will be the smaller part.
>
> Serge Mechveliani

I think so too, but what I'm surprised about is that GHC apparently
can't use printing functions and all those other things I saw in
`strings hello`, for example things about sockets, forks and file IO,
from a library instead of compiling it into the 'hello' binary.

In the case of a 200k hello world file that isn't a big problem, but
I want to do things with the gtk+hs toolkit and I've noticed that even
the simplest programs take up 2Mb.  'strings' on an executable that
uses gtk+hs show an endless list of unused functions.  Things like
functions for radiobuttons while the only thing I use is a label, a box
and a window.  I find that kind of strange.  I'm using GHC on a laptop
with not a very big harddisk.

Doesn't GHC first check which functions actually get used and then
only link those to the binary?

Willem van Hage
-- 
[EMAIL PROTECTED] | http://www.xs4all.nl/~wrvh
[EMAIL PROTECTED] | http://quest.sourceforge.net
Re: enormous executable
Willem Robert van Hage <[EMAIL PROTECTED]> writes

> [..]
> So I tried a simple hello world example exactly like the one
> I saw somewhere on the GHC pages that compiled to a binary of 6kb
> and I was surprised to see that even after stripping the binary
> it still takes up 140kb.
>
> -rwxr-xr-x  1 willem users 201453 Sep 29 09:55 hello
> -rw-r--r--  1 willem users     51 Sep 29 09:54 hello.hs
> -rw-r--r--  1 willem users   1468 Sep 29 09:55 hello.o

It shows that the source program takes 51 bytes and the compiled
object program 1468 bytes. This is not much.

Probably, the executable hello is 1000 times larger than object
one because some piece of library (including binary code for
outputting a string) is linked to it.
In small user programs the library code is usually the larger
part. In large user programs, it will be the smaller part.

Regards,

-
Serge Mechveliani
[EMAIL PROTECTED]
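As a quick check on the sizes in the listing above, the following
sketch works out the ratio of executable to object size. The constant
and function names are made up for this illustration; only the two byte
counts come from the thread. By this arithmetic the executable is
roughly 137 times the size of hello.o, not 1000 times, though the
point about library code dominating small programs is unaffected.

```c
/* Sizes in bytes, copied from the ls -l listing quoted in the thread. */
static const long hello_exe_bytes = 201453;
static const long hello_obj_bytes = 1468;

/* Ratio of executable size to object size, rounded down to an integer. */
long size_ratio(long exe_bytes, long obj_bytes) {
    return exe_bytes / obj_bytes;
}
```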