On Tue, 23 Jan 2007, John Meacham wrote:
I think the benchmarks are flawed in an important way. I believe (but am not positive) that ARPREC uses a special factorized form of representing numbers, which makes multiplicative-type operations extremely fast but simple things like addition/subtraction quite slow.

Oh, don't worry. I've given up on ARPREC for the simple reason that *every* operation on small-size operands is very slow, to say nothing of the time it would take to initialise ARPREC every time. ARPREC does store numbers using a representation similar to floating point: essentially an array of doubles where:
        (int)(array[0]) = array_size
        (int)(array[1]) = no_mantissa_words
        array[2]        = exponent
        array[3 ..]     = mantissa
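
For concreteness, here is a minimal C sketch of decoding that header, assuming only the array-of-doubles layout above; the function name and printed fields are mine, not ARPREC's API:

    #include <stdio.h>

    /* Illustrative only: read the header of an ARPREC-style number,
     * stored as an array of doubles laid out as described above. */
    static void print_arprec_header(const double *array)
    {
        int array_size        = (int)array[0];   /* words allocated     */
        int no_mantissa_words = (int)array[1];   /* mantissa words used */
        double exponent       = array[2];
        const double *mantissa = &array[3];

        printf("size=%d mantissa_words=%d exponent=%g first_word=%g\n",
               array_size, no_mantissa_words, exponent, mantissa[0]);
    }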
I hope the tests aren't "flawed": the main purpose of performing those operations was to see mid-range performance (extreme performance being Integers with operands greater than 10,000 or 30,000 bits). The crummy graphs do show that I made a low cutoff of 256 bits, so the most common Integer use (< 128 bits, i.e. 4 uint32_t words) wasn't even tested. I probably didn't clarify why I tested larger operand sizes: GHC should have something comparable to GMP, and that means speed (and precision) for medium-large operands as well as small ones.

you are only benchmarking multiplicative or similar routines, giving
ARPREC a huge lead, when in practice it might end up being slowest, as
addition/subtraction are far more common than multiplication.

In the common case it is slowest--but not by much.

Also, how are you choosing the numbers to test with? It is possible that
some packages are using 'sparse' representations or other specialized
forms if all your numbers happen to be powers of two or something.

Random numbers--a different number for each iteration, all of the same size. I used a PRNG based on the SNOW2 stream cipher--something I wrote a while ago; it is as fast as arc4random and tests well on DIEHARD and other statistical suites. The GMP and OpenSSL random number generators were slower, and I wanted to use the same generator across libraries.
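
To give an idea of what the operand generation looked like, here is a rough C sketch; the xorshift generator below is only a stand-in for the SNOW2-based PRNG, and the names are made up:

    #include <stdint.h>
    #include <stddef.h>

    static uint32_t prng_state = 0x12345678u;   /* seed; stand-in generator */

    static uint32_t prng_next(void)             /* xorshift32, not SNOW2 */
    {
        uint32_t x = prng_state;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        return prng_state = x;
    }

    /* Fill op[0..nwords-1] with a fresh random operand of exactly
     * nwords * 32 bits: a new value every iteration, but always the
     * same size, as in the benchmarks above. */
    static void random_operand(uint32_t *op, size_t nwords)
    {
        for (size_t i = 0; i < nwords; i++)
            op[i] = prng_next();
        op[nwords - 1] |= 0x80000000u;   /* pin the top bit so the bit length is fixed */
    }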

also, pretty much all uses of integers will be for very small integers; we should be careful not to lose sight of speed for the common case due to pretty asymptotic bounds.

All numbers were the same size, so cases like multiplying a 1024-bit operand by a 256-bit operand weren't tested; that could be a real flaw. It's all very sloppy--just to get an idea of where things generally line up.

So, where am I now? I worked on GHC-Win off and on, then went back to making a uniform API between GHC and the replacement, and I am now re-writing the replacement. I thought I would be done several weeks ago, but of course little things take over...

One area of GHC I would really like to change is the memory use. ForeignPtr seems to work well but places the full burden of memory management on the Integer library; for SIMD use (AltiVec, SSE2), the library would require GHC to allocate memory aligned to 16 bytes. Right now, the only choice would be allocatePinned() (in rts/sm/Storage.c), which is 4-byte aligned and is, of course, pinned so the GC can't move it. Imagine the RTS memory problems you could have if you wrote a Haskell program using lots of small Integers allocated with, say, an allocatePinned_16() (16-byte aligned). The alternative would be to use normal memory but re-align the operands for the vectorized operations by peeling; o.k., doable (especially easy with AltiVec), but slower.

In the meantime I have been working through SSE2 assembler, which has no addc (add-with-carry) operation and doesn't set a flag on overflow, so I have been experimenting with a SWAR-like algorithm--essentially the same thing as GMP's nails--to make the normally 32-bit operands 31 bits wide, with the last bit reserved as a carry flag; a sketch of the idea follows below. Thorkil Naur and others have suggested writing the whole thing as small assembler operations and piecing them together in Haskell; I have been looking into that as well, but it seems to entail inlining every Integer function--imagine the code bloat.
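
To make the nails idea concrete, here is a minimal sketch in C with SSE2 intrinsics (rather than assembler) of adding two operands stored as 31-bit limbs in 32-bit words; the function name, the limb layout, and the scalar carry pass are my own assumptions, not the replacement library's actual code:

    #include <emmintrin.h>   /* SSE2 intrinsics */
    #include <stdint.h>
    #include <stddef.h>

    #define LIMB_MASK 0x7FFFFFFFu   /* 31 value bits per word; bit 31 is the spare carry bit */

    /* r = a + b, where a and b are little-endian arrays of n 31-bit limbs
     * (top bit of every input word assumed zero) and n is a multiple of 4.
     * Returns the carry out of the most significant limb. */
    static uint32_t add_nails_sse2(uint32_t *r, const uint32_t *a,
                                   const uint32_t *b, size_t n)
    {
        /* Pass 1: plain lane-wise adds.  Each limb is only 31 bits, so a
         * 32-bit lane cannot overflow and no addc is needed; the per-lane
         * carry simply lands in the spare bit 31. */
        for (size_t i = 0; i < n; i += 4) {
            __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
            __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
            _mm_storeu_si128((__m128i *)(r + i), _mm_add_epi32(va, vb));
        }

        /* Pass 2: strip the carry bit from each limb and ripple it into
         * the next one.  Scalar here just to keep the sketch short. */
        uint32_t carry = 0;
        for (size_t i = 0; i < n; i++) {
            uint32_t s = r[i] + carry;   /* at most 0xFFFFFFFF, cannot wrap */
            carry = s >> 31;
            r[i]  = s & LIMB_MASK;
        }
        return carry;
    }

The point of the spare bit is that the vectorized pass never needs an add-with-carry at all; the carry handling all moves into the second pass.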

Cheers,
Pete
