Vincent Snijders wrote:
I am trying out the SSE2 support in fpc 2.1.1, compiling my code with -Cfsse2.

The program is the mandelbrot benchmark:
http://shootout.alioth.debian.org/gp4/benchmark.php?test=mandelbrot&lang=fpascal

The inner loop looks like this:
      Zr := 0;  Zi := 0; Tr := 0; Ti := 0;
      i := 0;
      while (i<50) and (Tr + Ti< limit) do begin
        inc(i);
        Zi := 2*Zr*Zi + Ci;
        Zr := Tr - Ti + Cr;
        Ti := Zi * Zi;
        Tr := Zr * Zr;
      end;

Looking at the generated assembler, I noticed that only %xmm0 is used, and that each variable has to be loaded from the stack every time it is needed.

# Var Zr located in register mreg0md
# Var Zi located in register mreg0md
# Var Ti located in register mreg0md
# Var Tr located in register mreg0md
# Var Cr located in register mreg0md
# Var Ci located in register mreg0md

...

# [33] Zr := Tr - Ti + Cr;
    movsd    -88(%ebp),%xmm0
    subsd    -80(%ebp),%xmm0
    addsd    -96(%ebp),%xmm0
    movsd    %xmm0,-112(%ebp)
# [34] Ti := Zi * Zi;
    movsd    -72(%ebp),%xmm0
    mulsd    -72(%ebp),%xmm0
    movsd    %xmm0,-80(%ebp)

I thought there was more than one xmm register. Why aren't the others used?


They are not used because the same procedure body also contains a procedure call, and the %xmm registers are volatile (not preserved across calls), so the variables cannot be kept in them. See also the notes on issue 7533:
http://www.freepascal.org/mantis/view.php?id=7533
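
If the call in the same body is indeed the cause, one possible workaround is to move the inner loop into a routine of its own that makes no calls, so that the register allocator is at least free to keep the variables in xmm registers. The following is only a sketch and has not been checked against fpc 2.1.1: the function name Iterate and the constant Limit = 4.0 are assumptions for illustration, not taken from the benchmark source, and whether register variables are actually used still depends on the compiler version and optimization settings.

```pascal
program MandelIter;
{$mode objfpc}

const
  Limit = 4.0;  { assumed escape limit, not taken from the original post }

{ Hypothetical helper: runs the inner loop for the point (Cr, Ci) and
  returns the number of iterations performed (at most 50).
  The body contains no procedure calls; Inc is a compiler intrinsic. }
function Iterate(const Cr, Ci: Double): Integer;
var
  Zr, Zi, Tr, Ti: Double;
  i: Integer;
begin
  Zr := 0; Zi := 0; Tr := 0; Ti := 0;
  i := 0;
  while (i < 50) and (Tr + Ti < Limit) do
  begin
    Inc(i);
    Zi := 2 * Zr * Zi + Ci;
    Zr := Tr - Ti + Cr;
    Ti := Zi * Zi;
    Tr := Zr * Zr;
  end;
  Result := i;
end;

begin
  { The calls (WriteLn) live here, outside the loop's own body. }
  WriteLn(Iterate(0.0, 0.0));  { never escapes: prints 50 }
  WriteLn(Iterate(2.0, 2.0));  { escapes after one step: prints 1 }
end.
```

Compiling this with -Cfsse2 and inspecting the assembler output (option -al) would show whether the variables of Iterate now end up in registers.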

Vincent
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
