John Meacham wrote:
> On Wed, Jan 18, 2006 at 08:54:43PM +0300, Bulat Ziganshin wrote:
>> sorry, with the "gcc -O3 -ffast-math -fstrict-aliasing -funroll-loops"
>> the C version is 50 times faster than best Haskell one... it's the
>> loop from C version:
>
> I believe something similar to what I noted here is the culprit:
> http://www.haskell.org//pipermail/glasgow-haskell-users/2005-October/009174.html
>
> it is fixable, but not without modifying ghc.

Ah, I see what you mean by indirect jumps. Those indirect jumps go away if you compile with -optc-O2 or -fasm; they're droppings left by inadequacies in gcc's standard -O optimisation.

Actually, -fasm does better by one instruction than gcc on this example:

.globl Test_zdwfac_info
Test_zdwfac_info:
        movq (%rbp),%rax
        cmpq $1,%rax
        jne .LcmO
        movq 8(%rbp),%r13
        addq $16,%rbp
        jmp *(%rbp)
.LcmO:
        leaq -1(%rax),%rcx
        imulq 8(%rbp),%rax
        movq %rax,8(%rbp)
        movq %rcx,(%rbp)
        jmp Test_zdwfac_info

vs. gcc -O2:

Test_zdwfac_info:
.text
        .align 8
        movq    (%rbp), %rdx
        cmpq    $1, %rdx
        je      .L6
.L3:
        movq    8(%rbp), %rax
        imulq   %rdx, %rax
        decq    %rdx
        movq    %rdx, (%rbp)
        movq    %rax, 8(%rbp)
        jmp     Test_zdwfac_info
        .p2align 4,,7
.L6:
        movq    8(%rbp), %r13
        addq    $16, %rbp
        jmp     *(%rbp)


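We should probably reverse the sense of that branch, like gcc does, so that the loop body is the fall-through path and only the exit is a taken branch. The memory accesses are still there in both versions, of course: every iteration reads and writes both (%rbp) and 8(%rbp). Hopefully someday I'll get around to trying to use more registers on x86_64 again.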
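
To make the point about memory accesses concrete, here is a rough C model of what both listings compute. It is only a sketch read off the instructions above, not the source that produced them: the names fac_stack_slots and fac_registers are made up for illustration, and sp[0] and sp[1] stand in for the slots at (%rbp) and 8(%rbp). Like the listings, it tests only for n == 1, so it assumes n >= 1.

```c
#include <stdint.h>

/* Hypothetical model of the loop in the listings: sp[0] holds n and
   sp[1] holds the running product, mirroring (%rbp) and 8(%rbp).
   Assumes sp[0] >= 1 on entry, since only n == 1 ends the loop. */
int64_t fac_stack_slots(int64_t *sp)
{
    for (;;) {
        int64_t n = sp[0];      /* load n from its slot */
        if (n == 1)
            return sp[1];       /* exit path: fetch the result slot */
        sp[1] = n * sp[1];      /* multiply, then store the product back */
        sp[0] = n - 1;          /* store the decremented counter back */
    }
}

/* The same loop with both values held in locals, which a C compiler
   is free to keep in registers for the whole loop. Assumes n >= 1. */
int64_t fac_registers(int64_t n, int64_t acc)
{
    while (n != 1) {
        acc *= n;
        n -= 1;
    }
    return acc;
}
```

Calling fac_stack_slots on a two-element array holding {5, 1} and calling fac_registers(5, 1) give the same result; the difference is only where n and the accumulator live between iterations, which is what keeping more values in registers would change.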

Cheers,
        Simon
_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
