Re: Looking for advice on profiling

2004-11-11 Thread John Meacham
On Wed, Nov 10, 2004 at 10:07:46AM +, Malcolm Wallace wrote:
> "Simon Marlow" <[EMAIL PROTECTED]> writes:
> 
> > On 09 November 2004 17:04, Duncan Coutts wrote:
> > 
> > >> Are you using BinMem, or BinIO?
> > > 
> > > BinIO
> > 
> > Ah.  BinIO is going to be a lot slower than BinMem, because it does
> > an hPutChar for each character, whereas BinMem just writes into an
> > array.  I never really optimised the BinIO path, because we use BinMem
> > exclusively in GHC.
> 

I have also ported the binary library to ghc6 as part of my ginsu
project and done some work on improving its efficiency. In addition, I
have updated DrIFT so that it can derive both the old bitwise nhc-style
binary and the new byte-based ghc-style binary (the byte-based version,
which I use in ginsu, is significantly faster). The code is available
as part of ginsu at
http://repetae.net/computer/ginsu/

I also have a much improved PackedString based on raw UTF-8 in memory,
with optimized unboxed folding routines, which is designed to be very
fast to serialize with Binary. In ginsu, switching from String to
PackedString cut my memory footprint from 200 megs to 10, quite a
nice improvement.
John

-- 
John Meacham - ⑆repetae.net⑆john⑈
___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


RE: Looking for advice on profiling

2004-11-10 Thread Simon Marlow
On 10 November 2004 10:08, Malcolm Wallace wrote:

> "Simon Marlow" <[EMAIL PROTECTED]> writes:
> 
>> On 09 November 2004 17:04, Duncan Coutts wrote:
>> 
>>>> Are you using BinMem, or BinIO?
>>> 
>>> BinIO
>> 
>> Ah.  BinIO is going to be a lot slower than BinMem, because it
>> does an hPutChar for each character, whereas BinMem just writes into
>> an array.  I never really optimised the BinIO path, because we use
>> BinMem exclusively in GHC.
> 
> Is there a method in your Binary library to freeze a BinMem into a
> file all in one go?  The original nhc98 Binary library allows this
> (`copyBin'). 

Yup, it's called 'writeBinMem'.

Cheers,
Simon
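
The difference described above (one handle operation per character versus filling an array and writing it out in one go) can be sketched with plain base-library calls. This is only an illustration under stated assumptions: it does not use GHC's Binary module, and `writePerByte`, `writeBuffered` and `readBytes` are hypothetical names made up for this example, not `BinIO`, `BinMem` or `writeBinMem` themselves.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray)
import Foreign.Storable (pokeElemOff)
import System.IO (IOMode (..), hFileSize, hGetBuf, hPutBuf, hPutChar, withBinaryFile)

-- Hypothetical helper: one hPutChar call per byte, roughly the
-- behaviour attributed to the BinIO path in the thread.
writePerByte :: FilePath -> [Word8] -> IO ()
writePerByte path bytes =
  withBinaryFile path WriteMode $ \h ->
    mapM_ (hPutChar h . toEnum . fromIntegral) bytes

-- Hypothetical helper: fill an in-memory buffer first, then hand the
-- whole buffer to the handle with a single hPutBuf call.
writeBuffered :: FilePath -> [Word8] -> IO ()
writeBuffered path bytes = do
  let n = length bytes
  allocaBytes n $ \buf -> do
    mapM_ (\(i, b) -> pokeElemOff buf i b) (zip [0 ..] bytes)
    withBinaryFile path WriteMode $ \h -> hPutBuf h buf n

-- Read a whole file back as bytes, so the two outputs can be compared.
readBytes :: FilePath -> IO [Word8]
readBytes path =
  withBinaryFile path ReadMode $ \h -> do
    n <- fromIntegral <$> hFileSize h
    allocaBytes n $ \buf -> do
      got <- hGetBuf h buf n
      peekArray got buf
```

Both writers should produce identical files; only the number of handle operations differs, which is the cost the thread is pointing at.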


Re: Looking for advice on profiling

2004-11-10 Thread Malcolm Wallace
"Simon Marlow" <[EMAIL PROTECTED]> writes:

> On 09 November 2004 17:04, Duncan Coutts wrote:
> 
> >> Are you using BinMem, or BinIO?
> > 
> > BinIO
> 
> Ah.  BinIO is going to be a lot slower than BinMem, because it does
> an hPutChar for each character, whereas BinMem just writes into an
> array.  I never really optimised the BinIO path, because we use BinMem
> exclusively in GHC.

Is there a method in your Binary library to freeze a BinMem into a file
all in one go?  The original nhc98 Binary library allows this (`copyBin').

Regards,
Malcolm


RE: Looking for advice on profiling

2004-11-10 Thread Simon Marlow
On 09 November 2004 17:04, Duncan Coutts wrote:

>> Are you using BinMem, or BinIO?
> 
> BinIO

Ah.  BinIO is going to be a lot slower than BinMem, because it does
an hPutChar for each character, whereas BinMem just writes into an
array.  I never really optimised the BinIO path, because we use BinMem
exclusively in GHC.

Cheers,
Simon


RE: Looking for advice on profiling

2004-11-09 Thread Duncan Coutts
On Tue, 2004-11-09 at 14:45, Simon Marlow wrote:
> On 09 November 2004 12:54, Duncan Coutts wrote:
> 
> [snip]
> > When I do time profiling, the big cost centres come up as putByte and
> > putWord. When I profile for space it shows the large FiniteMaps
> > dominating most everything else. I originally guessed from that that
> > the serialisation must be forcing loads of thunks which is why it
> > shows up so highly on the profile. However even after doing the
> > deepSeq before serialisation, it takes a great deal of time, so I'm
> > not sure what's going on.
> 
> Let's get the simple things out of the way first: make sure you're
> compiling Binary with -O -funbox-strict-fields (very important).  When
> compiling for profiling, don't compile Binary with -auto-all, because
> that will add cost centres to all the small functions and really skew
> the profile.  I find this is a good rule of thumb when profiling: avoid
> -auto-all on your low-level libraries that you hope to be inlined a lot.

Ok, I was missing -funbox-strict-fields. I'll try that.

> You say your instances are created using DrIFT - I don't think we ever
> modified DrIFT to generate the right kind of instances for the Binary
> library in GHC, so are you using the instances designed for the nhc98
> binary library?  If so, make sure your instances are using put_ rather
> than put, because the former will allow binary output to run in constant
> stack space.

It's using put_

> Are you using BinMem, or BinIO?

BinIO

> > The retainer profiling again shows that the FiniteMaps are holding on
> > to most stuff.
> > 
> > A major problem no doubt is space use. For the large gtk/gtk.h, when I
> > run with +RTS -B to get a beep every major garbage collection, the
> > serialisation phase beeps continuously while the file grows.
> > Occasionally it seems to freeze for 10s of seconds, not doing any
> > garbage collection and not doing any file output but using 100% CPU,
> > then it carries on outputting and garbage collecting furiously. I
> > don't know how to work out what's going on when it does that.
> 
> I agree with Malcolm's conjecture: it sounds like a very long major GC
> pause.

Right, ok.

> > I don't understand how it can be generating so much garbage when it is
> > doing the serialisation stuff on a structure that has already been
> > fully deepSeq'ed.
> 
> Yes, binary output *should* do zero allocation, and binary input should
> only allocate the structure being created.  The Binary library is quite
> heavily tuned so that this is the case (if you compile with profiling
> and -auto-all, it will almost certainly break this property, though).

Yes, it's much better with optimisations. I'll try the
-funbox-strict-fields and report back.

Duncan




RE: Looking for advice on profiling

2004-11-09 Thread Simon Marlow
On 09 November 2004 12:54, Duncan Coutts wrote:

[snip]
> When I do time profiling, the big cost centres come up as putByte and
> putWord. When I profile for space it shows the large FiniteMaps
> dominating most everything else. I originally guessed from that that
> the serialisation must be forcing loads of thunks which is why it
> shows up so highly on the profile. However even after doing the
> deepSeq before serialisation, it takes a great deal of time, so I'm
> not sure what's going on.

Let's get the simple things out of the way first: make sure you're
compiling Binary with -O -funbox-strict-fields (very important).  When
compiling for profiling, don't compile Binary with -auto-all, because
that will add cost centres to all the small functions and really skew
the profile.  I find this is a good rule of thumb when profiling: avoid
-auto-all on your low-level libraries that you hope to be inlined a lot.
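
As a rough illustration of what -funbox-strict-fields buys: as far as the GHC documentation describes it, the flag treats every strict constructor field as if it carried an UNPACK pragma, so small fields are stored directly in the constructor instead of behind a pointer. The sketch below spells the pragmas out by hand. The `Cursor` type and its functions are hypothetical and are not taken from the Binary library.

```haskell
import Data.List (foldl')
import Data.Word (Word8)

-- Hypothetical writer state. With -O -funbox-strict-fields on the
-- command line, plain strict fields (!Int) would be unpacked the same
-- way without the explicit pragmas.
data Cursor = Cursor
  { offset   :: {-# UNPACK #-} !Int  -- bytes written so far
  , checksum :: {-# UNPACK #-} !Int  -- running sum of the bytes
  }

-- Advance the cursor by one byte; both fields are forced on
-- construction because they are strict.
step :: Cursor -> Word8 -> Cursor
step (Cursor o c) b = Cursor (o + 1) (c + fromIntegral b)

-- Strict left fold, so no chain of suspended Cursor updates builds up.
runCursor :: [Word8] -> Cursor
runCursor = foldl' step (Cursor 0 0)
```

For example, `runCursor [1, 2, 3]` ends with an offset of 3 and a checksum of 6.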

You say your instances are created using DrIFT - I don't think we ever
modified DrIFT to generate the right kind of instances for the Binary
library in GHC, so are you using the instances designed for the nhc98
binary library?  If so, make sure your instances are using put_ rather
than put, because the former will allow binary output to run in constant
stack space.

Are you using BinMem, or BinIO?

> The retainer profiling again shows that the FiniteMaps are holding on
> to most stuff.
> 
> A major problem no doubt is space use. For the large gtk/gtk.h, when I
> run with +RTS -B to get a beep every major garbage collection, the
> serialisation phase beeps continuously while the file grows.
> Occasionally it seems to freeze for 10s of seconds, not doing any
> garbage collection and not doing any file output but using 100% CPU,
> then it carries on outputting and garbage collecting furiously. I
> don't know how to work out what's going on when it does that.

I agree with Malcolm's conjecture: it sounds like a very long major GC
pause.

> I don't understand how it can be generating so much garbage when it is
> doing the serialisation stuff on a structure that has already been
> fully deepSeq'ed.

Yes, binary output *should* do zero allocation, and binary input should
only allocate the structure being created.  The Binary library is quite
heavily tuned so that this is the case (if you compile with profiling
and -auto-all, it will almost certainly break this property, though).
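
For reference, the "fully deepSeq'ed before serialisation" step discussed here could look roughly like the following. It is a hand-rolled stand-in that uses only `seq`. The class `Forceable` and the function `forcing` are hypothetical names for this sketch, not the DeepSeq class Duncan is actually using.

```haskell
-- Hypothetical class: forceAll evaluates a value completely and
-- returns unit once every component has been evaluated.
class Forceable a where
  forceAll :: a -> ()

instance Forceable Int where
  forceAll n = n `seq` ()

instance Forceable Char where
  forceAll c = c `seq` ()

instance Forceable a => Forceable [a] where
  forceAll []       = ()
  forceAll (x : xs) = forceAll x `seq` forceAll xs

instance (Forceable a, Forceable b) => Forceable (a, b) where
  forceAll (a, b) = forceAll a `seq` forceAll b

-- Evaluate the whole structure first, then pass it to the consumer
-- (for instance a serialiser), so the consumer forces no thunks.
forcing :: Forceable a => a -> (a -> IO b) -> IO b
forcing x k = forceAll x `seq` k x
```

If serialisation is still slow after a step like this, thunk evaluation is unlikely to be the cause, which is the reasoning behind looking at compiler flags and the I/O path instead.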

Cheers,
Simon


Re: Looking for advice on profiling

2004-11-09 Thread Malcolm Wallace
Duncan Coutts <[EMAIL PROTECTED]> writes:

> I'm looking for some advice on profiling and any suggestion on what
> might be going on with this program.

One suggestion might be to serialise (key,value) pairs to file as
they are first encountered, rather than waiting until they are all
inside FiniteMaps.  That would eliminate the time you are currently
spending on lookups.  (A subsequent run would then need to do the
insertion of binary (key,value)s, rather than having them already
ordered, but at least you save the textual parsing cost there.)
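
A minimal sketch of this suggestion, under loudly stated assumptions: a tab-separated text line stands in for the binary encoding, Data.Map stands in for FiniteMap, and `emitPair`, `streamPairs` and `loadPairs` are hypothetical names. Keys containing tabs or newlines would break this toy format.

```haskell
import qualified Data.Map as Map
import System.IO (Handle, IOMode (..), hPutStrLn, withFile)

-- Hypothetical record format: one "key<TAB>value" line per pair.
emitPair :: Handle -> (String, Int) -> IO ()
emitPair h (k, v) = hPutStrLn h (k ++ "\t" ++ show v)

-- Write each pair as soon as it is produced (the list stands in for
-- the parser's output), without building a map first.
streamPairs :: FilePath -> [(String, Int)] -> IO ()
streamPairs path pairs =
  withFile path WriteMode $ \h -> mapM_ (emitPair h) pairs

-- A later run pays for the insertions instead: read the pairs back
-- and build the map from them.
loadPairs :: FilePath -> IO (Map.Map String Int)
loadPairs path = do
  contents <- readFile path
  let parseLine l = case break (== '\t') l of
        (k, _ : v) -> (k, read v)
        (k, [])    -> (k, 0)
  -- Force the whole file so the handle is closed before returning.
  length contents `seq` return (Map.fromList (map parseLine (lines contents)))
```

The trade-off is the one described above: the writing run never holds the full map, while the reading run does the insertion work but skips the textual parsing of the original input.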

> A major problem no doubt is space use. For the large gtk/gtk.h, when I
> run with +RTS -B to get a beep every major garbage collection, the
> serialisation phase beeps continuously while the file grows.
> Occasionally it seems to freeze for 10s of seconds, not doing any garbage
> collection and not doing any file output but using 100% CPU, then it
> carries on outputting and garbage collecting furiously. I don't know how
> to work out what's going on when it does that.

One guess might be generational collection: fast beeps are for the
current generation, pauses are older generations?
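
One way to test this guess on a recent GHC, as opposed to the 2004 toolchain in this thread, is to ask the runtime how many of its collections were major ones. The sketch assumes the `GHC.Stats` module from a current base library and a program started with `+RTS -T`. `gcCounts` is a hypothetical helper name.

```haskell
import Data.Word (Word32)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- Returns (total collections, major collections) so far, or Nothing
-- when the program was not started with +RTS -T and the runtime is
-- therefore not collecting statistics.
gcCounts :: IO (Maybe (Word32, Word32))
gcCounts = do
  enabled <- getRTSStatsEnabled
  if enabled
    then do
      s <- getRTSStats
      return (Just (gcs s, major_gcs s))
    else return Nothing
```

Calling `gcCounts` before and after the serialisation phase would show whether the long freezes line up with an increase in the major-collection count.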

Regards,
Malcolm