Florian Klaempfl wrote:
Vincent Snijders wrote:
Daniël Mantione wrote:
On Fri, 6 Oct 2006, Micha Nelissen wrote:
Vincent Snijders wrote:
You could also start an assembler implementation of the matrix unit.
I suppose using it is allowed, and a Tvector2_double looks a lot like
such a double2.
Unless the compiler somehow helps, inlining the assembler
implementation won't work and then the speedup might be lost again.
I have started adding Vector Pascal-like support; currently only
i386/x86_64 is supported (no generic support). All of the functionality
implemented so far is demonstrated by the following example. Please give
some feedback on whether it allows benchmark speedups.
Thanks, Florian, for starting the vector support.
I think this would help speed up the benchmarks. I cannot give a real
estimate of how much, maybe 20% or so.
There are still some bugs (or things not implemented).
Given the following program:
var
ad1,ad2,ad3 : array[0..1] of double;
begin
ad2[0] := 1;
ad2[1] := 3;
ad3[0] := 9;
ad3[1] := 12;
ad1:=ad2+ad3;
writeln(ad1[1]);
end.
It writes:
0.000000000000000E+000
Looking at the generated assembler, I see that ad1 in the writeln is read
from memory, but the result for ad1 is still only in the %xmm0 register.
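As a cross-check, here is a minimal scalar sketch of the same addition that does not rely on the new vector support. It assumes the last assignment is meant to set ad3[1], and the program name and the loop variable i are only for illustration. Written this way, the second writeln should show 15 (in scientific notation), which is the value the vector version would be expected to print instead of 0.

```pascal
program scalarcheck;
var
  ad1, ad2, ad3 : array[0..1] of double;
  i : integer;
begin
  ad2[0] := 1;
  ad2[1] := 3;
  ad3[0] := 9;
  ad3[1] := 12;  { assumption: the second operand's element 1 is 12 }
  { element-wise addition written out as a loop, no vector support needed }
  for i := 0 to 1 do
    ad1[i] := ad2[i] + ad3[i];
  writeln(ad1[0]);  { 1 + 9 = 10 }
  writeln(ad1[1]);  { 3 + 12 = 15 }
end.
```

Comparing the assembler for this loop with the assembler for ad1:=ad2+ad3 may help show where the store of %xmm0 back to ad1 goes missing.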
Furthermore, I ran into problems with alignment.
Vincent
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel