RE: Inliner behaviour - tiny changes lead to huge performance differences

Simon Peyton-Jones Tue, 17 Nov 2009 00:24:57 -0800

Bryan

It’s good news that the HEAD is better.


To be honest I’m not terribly enthusiastic about trying to nail down exactly 
what’s happening in 6.10 and 6.12 because, although they are indeed the 
compilers people will be using, it’s otherwise wasted work because the HEAD is 
so different.

Can you try with 6.12 and see if you can find a recipe that does well enough?
If you get desperate (i.e. there’s a huge perf bump that you can’t eliminate)
then I’ll certainly try to help.

Meanwhile, I don’t know why 6.10 is faster than HEAD (by 25% too) and I’d like
to understand that. Can you submit a Trac ticket saying how to reproduce it? You
might need to bundle up the library too, to make sure we can reproduce it
precisely.

Thanks

Simon

From: Bryan O'Sullivan [mailto:b...@serpentine.com]
Sent: 17 November 2009 07:14
To: Simon Peyton-Jones
Cc: glasgow-haskell-users@haskell.org
Subject: Re: Inliner behaviour - tiny changes lead to huge performance differences

On Fri, Nov 13, 2009 at 12:26 AM, Simon Peyton-Jones
<simo...@microsoft.com> wrote:

My goal is for INLINE pragmas to be very predictable.  I can't decode your 
message enough to offer any insights; thank you Roman, who is closer to it, for 
helping.

Things are considerably different with HEAD than with 6.10.4. HEAD is indeed 
spotting and exploiting many of the opportunities for inlining, while 6.10.4 is 
a bit of a morass. The difference is stark: my test program runs in 0.5 seconds 
with HEAD, and 1.2 with 6.10.4.

Here's a rough table of my results:

6.10.4   8.39 seconds
HEAD     0.50
HEAD*    0.50
6.10.4*  0.39
6.10.4** 0.34

The asterisk above denotes the removal of a single INLINE pragma from the text
library.
The doubled asterisk denotes the removal of a piece of indirection: rather than
defining length as lengthI with both marked INLINE, I manually inlined the body
of lengthI into length.
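
To make the two shapes concrete, here is a minimal sketch of the indirection and of its hand-inlined counterpart. It is not the text library's actual code: it works over plain lists rather than the library's own types, and the names lengthIndirect and lengthDirect are hypothetical, chosen only so that both variants can sit side by side in one module.

```haskell
module Main (main) where

import Data.List (foldl')

-- Hypothetical stand-in for a generic worker: counts elements and returns
-- the count at any Integral type. (Assumption: plain lists, not Text.)
lengthI :: Integral a => [b] -> a
lengthI = foldl' (\n _ -> n + 1) 0
{-# INLINE lengthI #-}

-- Shape 1: the public function is a thin wrapper around lengthI,
-- and both definitions carry an INLINE pragma.
lengthIndirect :: [b] -> Int
lengthIndirect = lengthI
{-# INLINE lengthIndirect #-}

-- Shape 2: the body of lengthI has been pasted in by hand,
-- so the compiler has one fewer layer to see through.
lengthDirect :: [b] -> Int
lengthDirect = foldl' (\n _ -> n + 1) 0
{-# INLINE lengthDirect #-}

main :: IO ()
main = do
  let xs = "hello"
  print (lengthIndirect xs)
  print (lengthDirect xs)
```

Both functions compute the same result; the only difference is whether the inliner has to unfold one definition or two before the loop is exposed at the call site.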

For your amusement, GNU "wc -m" takes 1.1 seconds to count the number of 
Unicode characters in the same file, so I think that our combination of 
performance and brevity is wonderful. Thanks!

So HEAD is far better than 6.10.4 (yay!), but a little tweaking of the library 
code makes the 6.10.4 code faster again (boo!). The HEAD inliner seems, as you 
hoped, to be behaving far more predictably than its predecessor.

If you'd like to investigate the remaining performance discrepancy between 
6.10.4 and HEAD, I'll create a Trac ticket with instructions on how to 
reproduce my numbers.

In the time between now and the release of 6.14, I wonder what to do. I'm 
building 6.12 to see how it fares, but my experience with 6.10 so far suggests 
that the behaviour of the 6.12 inliner will be fragile and difficult to 
understand, which is a bit of a shame. On that older code base, it seems that I 
can get really good fused performance, or okay unfused performance, but not 
both.

_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
