Florian Klaempfl wrote:
Vincent Snijders wrote:

Daniël Mantione wrote:


On Fri, 6 Oct 2006, Micha Nelissen wrote:


Vincent Snijders wrote:

You could also start an assembler implementation of the matrix unit. I suppose using it is allowed, and a Tvector2_double looks a lot like such a double2.


Unless the compiler somehow helps, inlining the assembler implementation won't work and then the speedup might be lost again.


I have started to add Vector Pascal-like support; currently only i386/x86_64 are supported (no generic support). All of the currently implemented functionality is demonstrated by the following example. Please give some feedback on whether it allows benchmark speedups.

Thanks Florian, for starting the vector support.

I think this would help speed up the benchmarks. I cannot give a real estimate of how much, maybe 20% or so.

There are still some problems, though: some bugs (or things not yet implemented).

Given the following program:

var
  ad1,ad2,ad3 : array[0..1] of double;

begin
  ad2[0] := 1;
  ad2[1] := 3;
  ad3[0] := 9;
  ad3[1] := 12;
  ad1:=ad2+ad3;
  writeln(ad1[1]);
end.

It writes:
 0.000000000000000E+000

Looking at the assembler, I see that ad1 in the writeln is read from memory, but its value is still only in the %xmm0 register.
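
For comparison, here is a minimal scalar sketch of the result I would expect, assuming that + on two arrays is meant to add corresponding elements. The helper name AddElementwise, the type name TDouble2 and the value given to ad3[1] are assumptions made for illustration; none of this is part of the new compiler support.

```pascal
program scalarsum;

{$mode objfpc}

type
  TDouble2 = array[0..1] of double;

{ Hypothetical helper: adds two arrays element by element in plain scalar code. }
function AddElementwise(const a, b: TDouble2): TDouble2;
var
  i: integer;
begin
  for i := Low(a) to High(a) do
    Result[i] := a[i] + b[i];
end;

var
  ad1, ad2, ad3: TDouble2;

begin
  ad2[0] := 1;
  ad2[1] := 3;
  ad3[0] := 9;
  ad3[1] := 12;  { assumed value, for illustration }
  ad1 := AddElementwise(ad2, ad3);
  writeln(ad1[1]:0:1);  { 3 + 12, prints 15.0 }
end.
```

With these inputs the scalar version gives a nonzero ad1[1], so a zero in the vectorised program suggests the result was never stored back to memory.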

I also ran into problems with alignment.

Vincent


_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
