Re[2]: [Haskell-cafe] Counting bits: Sanity Check

2006-04-13 Thread Bulat Ziganshin
Hello David,

Thursday, April 13, 2006, 12:55:05 AM, you wrote:

> Yes, especially curious since the algorithm is taken from AMD's
> optimization guide for the Athlon and Opteron series.  I'm not good  
> enough at reading core syntax to be able to see what GHC is doing  
> with it.

optimization for GHC is far away from low-level asm optimization so it
is not surprise that this don't work



-- 
Best regards,
 Bulatmailto:[EMAIL PROTECTED]

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Counting bits: Sanity Check

2006-04-12 Thread David F. Place


On Apr 12, 2006, at 5:21 PM, Daniel McAllansmith wrote:



Averages of user time of five runs on an Athlon64 running 64bit linux:

64  0.1974
kern0.2980
ones32  0.3240
table32 0.3754
table   0.3798

64 looks to be a good bit faster.

You didn't change anything in the ones32 algorithm did you?  The other
algorithms are taking roughly what they did last time, but ones32  
seems

consistently slower now.


Thanks for running it.  Looks like "64" is worth the trouble.  If the  
32 bit wide algorithm is so fast,  the 12 bit wide one must be  
blazingly fast.  (See my other thread "The Marriage of Heaven and  
Hell.") I didn't make any changes to ones32 according to "diff."


Cheers, David


David F. Place
mailto:[EMAIL PROTECTED]

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Counting bits: Sanity Check

2006-04-12 Thread Daniel McAllansmith
On Thursday 13 April 2006 08:55, David F. Place wrote:
> On Apr 12, 2006, at 3:30 PM, Daniel McAllansmith wrote:
> > Averages of user time for three runs on an Athlon64 running 64bit
> > linux:
> >
> > kern0.29700
> > ones32  0.30733
> > table32 0.37333
> > table   0.38400
> >
> > I ran a whole lot more of kern and ones32... kern was consistently
> > faster than
> > ones32.  Curious.
>
> Yes, especially curious since the algorithm is taken from AMD's
> optimization guide for the Athlon and Opteron series.  I'm not good
> enough at reading core syntax to be able to see what GHC is doing
> with it.
>
> I wonder how this other crazy algorithm I found will work on your 64
> bit omputer.   It is much slower on my 32 bit PPC powerbook for
> obvious reasons.   If you'd like to try it, I'll include an updated
> BitTwiddle.hs .
>
> Usage:
> time ./bits 200 300 64

Averages of user time of five runs on an Athlon64 running 64bit linux:

64  0.1974
kern0.2980
ones32  0.3240
table32 0.3754
table   0.3798

64 looks to be a good bit faster.

You didn't change anything in the ones32 algorithm did you?  The other 
algorithms are taking roughly what they did last time, but ones32 seems 
consistently slower now.

Daniel
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Counting bits: Sanity Check

2006-04-12 Thread David F. Place


On Apr 12, 2006, at 3:30 PM, Daniel McAllansmith wrote:

Averages of user time for three runs on an Athlon64 running 64bit  
linux:


kern0.29700
ones32  0.30733
table32 0.37333
table   0.38400

I ran a whole lot more of kern and ones32... kern was consistently  
faster than

ones32.  Curious.


Yes, especially curious since the algorithm is taken from AMD's  
optimization guide for the Athlon and Opteron series.  I'm not good  
enough at reading core syntax to be able to see what GHC is doing  
with it.


I wonder how this other crazy algorithm I found will work on your 64  
bit omputer.   It is much slower on my 32 bit PPC powerbook for  
obvious reasons.   If you'd like to try it, I'll include an updated  
BitTwiddle.hs .


Usage:
time ./bits 200 300 64



BitTwiddle.hs
Description: Binary data



David F. Place
mailto:[EMAIL PROTECTED]

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Counting bits: Sanity Check

2006-04-12 Thread Daniel McAllansmith
On Wednesday 12 April 2006 02:09, David F. Place wrote:
> If you'd like to give it a whirl on your fancy modern computers,

Averages of user time for three runs on an Athlon64 running 64bit linux:

kern0.29700
ones32  0.30733
table32 0.37333
table   0.38400

I ran a whole lot more of kern and ones32... kern was consistently faster than 
ones32.  Curious.

Daniel
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Counting bits: Sanity Check

2006-04-12 Thread Robert Dockins


On Apr 11, 2006, at 10:09 AM, David F. Place wrote:


Hi All,

Since it seems that real applications need more than just  union,  
intersection, difference and complement to be fast to make EnumSet  
useful, I've been looking into the less naive approaches to the  
other things.   In particular, size seems to find itself in the  
inner loop.   I've made a comparison of various approaches to bit  
counting.  It seems I was too hasty to declare Bulat's suggestion  
of table lookup (table,table32) the winner.  It seems Robert's  
suggestion of Kernighan's (kern) method is faster.


I also implemented the method described in pages 187-188 of  
Software Optimization Guide for AMD Athlon™ 64 and Opteron™  
Processors. (ones32)  It's slower on my powerbook, but may be the  
winner on 64bit processors.  Here are the results:


[Marcel:~/devl/EnumSet] david% time ./bits 200 300 ones32
21
1.788u 0.136s 0:03.30 57.8% 0+0k 0+0io 0pf+0w
[Marcel:~/devl/EnumSet] david% time ./bits 200 300 table
21
2.404u 0.164s 0:04.96 51.6% 0+0k 0+1io 0pf+0w
[Marcel:~/devl/EnumSet] david% time ./bits 200 300 table32
21
2.067u 0.140s 0:04.27 51.5% 0+0k 0+0io 0pf+0w
[Marcel:~/devl/EnumSet] david% time ./bits 200 300 kern
21
1.729u 0.137s 0:03.25 56.9% 0+0k 0+1io 0pf+0w

If you'd like to give it a whirl on your fancy modern computers,  
here's the code:




ghc -O3 -optc-O3 -o bits BitTwiddle.hs



I get similar results on my machine (PPC powerbook).  'ones32'   
barely edges out 'kern' and 'table' and 'table32' come in behind.


Average 'user' time over three runs:

ones32:  0.540s
kern: 0.545s
table: 0.730s
table32: 0.632s




Of course, if I've done something lame-brained and skewed the  
results, please let me know.


Cheers, David



David F. Place
mailto:[EMAIL PROTECTED]

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe



Rob Dockins

Speak softly and drive a Sherman tank.
Laugh hard; it's a long way to the bank.
  -- TMBG



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Counting bits: Sanity Check

2006-04-11 Thread David F. Place

Hi All,

Since it seems that real applications need more than just  union,  
intersection, difference and complement to be fast to make EnumSet  
useful, I've been looking into the less naive approaches to the other  
things.   In particular, size seems to find itself in the inner  
loop.   I've made a comparison of various approaches to bit  
counting.  It seems I was too hasty to declare Bulat's suggestion of  
table lookup (table,table32) the winner.  It seems Robert's  
suggestion of Kernighan's (kern) method is faster.


I also implemented the method described in pages 187-188 of Software  
Optimization Guide for AMD Athlon™ 64 and Opteron™ Processors.  
(ones32)  It's slower on my powerbook, but may be the winner on 64bit  
processors.  Here are the results:


[Marcel:~/devl/EnumSet] david% time ./bits 200 300 ones32
21
1.788u 0.136s 0:03.30 57.8% 0+0k 0+0io 0pf+0w
[Marcel:~/devl/EnumSet] david% time ./bits 200 300 table
21
2.404u 0.164s 0:04.96 51.6% 0+0k 0+1io 0pf+0w
[Marcel:~/devl/EnumSet] david% time ./bits 200 300 table32
21
2.067u 0.140s 0:04.27 51.5% 0+0k 0+0io 0pf+0w
[Marcel:~/devl/EnumSet] david% time ./bits 200 300 kern
21
1.729u 0.137s 0:03.25 56.9% 0+0k 0+1io 0pf+0w

If you'd like to give it a whirl on your fancy modern computers,  
here's the code:




BitTwiddle.hs
Description: Binary data


ghc -O3 -optc-O3 -o bits BitTwiddle.hs


Of course, if I've done something lame-brained and skewed the  
results, please let me know.


Cheers, David



David F. Place
mailto:[EMAIL PROTECTED]

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe